# How to Deploy NLP: Text Embeddings and Vector Search

![[Attachments/c02ef4c1affc7aed70841e350f898e6a_MD5.png]]

## Metadata

- Author: [[Elastic Search Labs]]
- Full Title: How to Deploy NLP: Text Embeddings and Vector Search
- Category: #articles
- Date: 2023-10-09
- URL: https://www.elastic.co/search-labs/how-to-deploy-nlp-text-embeddings-and-vector-search
- [ ] #toFile ➕ 2023-10-09
- [ ] #toProcess ➕ 2023-10-09

## Highlights added 2023-10-09

- How to deploy NLP: Text Embeddings and Vector Search ([View Highlight](https://read.readwise.io/read/01hcb395m9jve1cf55hvtbj8ss))
- Note: Claude summary. Key points from the article:
- The article walks through an example of using text embeddings and vector similarity search for natural language processing in Elasticsearch.
- It uses a sentence transformer model from Hugging Face to generate vector representations of text passages from the MS MARCO dataset.
- The passages are indexed in Elasticsearch together with their vector embeddings.
- Vector similarity search is then demonstrated by using the embeddings to find passages semantically similar to sample queries.
- The results are verified against relevance judgements from the MS MARCO dataset to confirm that relevant passages are returned.
- The example shows how to deploy a text embedding model, create an ingest pipeline that generates embeddings, index data, and then search using vector similarity.
- It provides a practical walkthrough for setting up semantic search capabilities in Elasticsearch using modern NLP models such as sentence transformers.

Key takeaways for developers looking to build semantic search capabilities with embeddings and vector similarity search:

- Pre-trained NLP models like sentence transformers provide an easy way to generate vector representations of text. These models only need the text as input and can embed full passages.
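The embedding-and-similarity idea in the takeaway above can be illustrated with a minimal pure-Python sketch: toy "embeddings" ranked by cosine similarity, the same measure Elasticsearch's kNN search uses. The 3-dimensional vectors and passage IDs are invented for illustration; real sentence-transformer models emit embeddings with hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (invented values; real models emit
# hundreds of dimensions per passage).
passages = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.0, 1.0, 0.2],
    "p3": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

# Exact (brute-force) nearest-neighbor ranking: score every passage
# against the query vector and sort by similarity.
ranked = sorted(passages,
                key=lambda p: cosine_similarity(query, passages[p]),
                reverse=True)
print(ranked)  # → ['p1', 'p3', 'p2']
```

At index scale this brute-force loop is what HNSW-based approximate search replaces, trading a little recall for a large speedup.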
- The vector embeddings need to be indexed in `dense_vector` fields in Elasticsearch to enable efficient similarity search; the mapping should be defined up front.
- Ingest pipelines with inference processors allow embeddings to be generated transparently during indexing, avoiding a separate embedding step.
- Vector similarity search is available through the `_knn_search` API, which finds the vectors closest to the query vector based on cosine similarity.
- Generating embeddings for search queries on the fly is not yet supported; the current process is a two-step approach in which the query embedding is generated separately.
- Approximate nearest neighbor search techniques like HNSW provide fast and efficient similarity search over vectors.
- Evaluating search quality on real-world labeled data is important for validation; the MS MARCO dataset provides a good benchmark for passage ranking and retrieval.
- Semantic search significantly improves discovery by going beyond keywords, but keywords can still be useful for filtering and narrowing a search.
- The techniques can scale to millions of documents given Elasticsearch's distributed search capabilities; relevance tuning and ranking is an ongoing research area.
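The indexing and search steps in the takeaways above can be sketched as the JSON payloads (written here as Python dict literals) sent to Elasticsearch: a `dense_vector` mapping, an ingest pipeline with an inference processor, and a `_knn_search` request body. The field names, `dims` value, and model ID are illustrative assumptions chosen to match a MiniLM-style sentence transformer, not values taken verbatim from the article.

```python
# Sketch of the three Elasticsearch request bodies involved. All names
# (fields, model ID, dims) are illustrative assumptions.

# 1. Index mapping with a dense_vector field for the embedding output.
index_mapping = {
    "mappings": {
        "properties": {
            "text": {"type": "text"},
            "text_embedding.predicted_value": {
                "type": "dense_vector",
                "dims": 384,  # assumed MiniLM-style model output size
                "index": True,
                "similarity": "cosine",
            },
        }
    }
}

# 2. Ingest pipeline: an inference processor embeds each document's
# "text" field at index time, so no separate embedding step is needed.
ingest_pipeline = {
    "description": "Text embedding pipeline",
    "processors": [
        {
            "inference": {
                "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
                "target_field": "text_embedding",
                "field_map": {"text": "text_field"},
            }
        }
    ],
}

# 3. _knn_search body. The query embedding is generated separately
# (the two-step approach noted above) and pasted into query_vector.
knn_search_body = {
    "knn": {
        "field": "text_embedding.predicted_value",
        "query_vector": [],  # fill in with the query's embedding
        "k": 10,
        "num_candidates": 100,
    },
    "_source": ["text"],
}
```

These bodies would be sent with `PUT /<index>`, `PUT _ingest/pipeline/<id>`, and `GET /<index>/_knn_search` respectively.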