SearchAnalyzer is not set during field mapping. #8499

Open
alkampfergit opened this issue Apr 17, 2025 · 3 comments
Labels: 8.x (Relates to a 8.x client version), Category: Bug

Comments

@alkampfergit

alkampfergit commented Apr 17, 2025

Elastic.Clients.Elasticsearch version: 8.17.4

Elasticsearch version: tried on both: 8.13.0 and 8.18.0

.NET runtime version: .NET 8

Operating system version: Windows 11

Description of the problem including expected versus actual behavior:

I'm migrating from NEST (for Elasticsearch 7) to the new client. I'm mapping a field with this code:

        mapping.Properties["securityTokens"] = new TextProperty()
        {
            Analyzer = "not_analyzed_lowercase",
            SearchAnalyzer = "not_analyzed_lowercase",
        };

But the SearchAnalyzer setting seems to be missing. I have a unit test that reads the mapping back from the index to verify that everything is correct, and there the SearchAnalyzer setting is null.

Expected behavior
SearchAnalyzer should be set correctly on index mapping. I've verified using the _mapping endpoint that the mapping is incorrect.


alkampfergit added the "8.x" and "Category: Bug" labels on Apr 17, 2025
@flobernd
Member

Hi @alkampfergit,

this is a weird one. Could you please post the JSON request that is made by the client?

You can inspect the response in the debugger and check the ApiCallDetails for that purpose.
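For readers who prefer to log the payload rather than use the debugger, the following is a minimal sketch of capturing the request JSON. It assumes an unsecured local node at `http://localhost:9800` (the address that appears later in this thread) and uses a placeholder index name, `my-index`; neither is part of the reporter's actual setup code, which is not shown here.

```csharp
using System;
using System.Text;
using Elastic.Clients.Elasticsearch;

// Assumption: unsecured local node; replace with your own endpoint and auth.
var settings = new ElasticsearchClientSettings(new Uri("http://localhost:9800"))
    // Keep request/response bytes in memory so they can be read after the call.
    .DisableDirectStreaming()
    // Indent the JSON so the dump is readable.
    .PrettyJson();

var client = new ElasticsearchClient(settings);

// "my-index" is a hypothetical index name used only for illustration.
var response = await client.Indices.CreateAsync("my-index");

// Human-readable summary: audit trail, request body, response body.
Console.WriteLine(response.DebugInformation);

// Raw request body, in case only the JSON is needed.
var requestBytes = response.ApiCallDetails.RequestBodyInBytes;
if (requestBytes is not null)
    Console.WriteLine(Encoding.UTF8.GetString(requestBytes));
```

Buffering the bodies costs memory on every call, so it is usually enabled only while diagnosing a problem.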

@alkampfergit
Author

The mapping is created with a call to the function below. This is a unit test whose aim is to check our compatibility with the driver. We currently use NEST for Elasticsearch 2 and NEST for Elasticsearch 7, and we are now adding version 8. Yes, we have customers on all three versions, and we must still be able to support versions as old as Elasticsearch 2 :)

 await _elasticClient.Indices.CreateAsync

This is the full dump of the call. As you can see, securityTokens has both analyzer and search_analyzer set.

Valid Elasticsearch response built from a successful (200) low level call on PUT: /test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16?pretty=true

# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://localhost:9800/ Took: 00:00:00.3328589
# Request:
{
  "mappings": {
    "dynamic_templates": [
      {
        "StringProperties": {
          "match": "s_*",
          "mapping": {
            "analyzer": "omnisearch_string_props",
            "fields": {
              "na": {
                "analyzer": "not_analyzed_lowercase",
                "type": "text"
              },
              "raw": {
                "type": "keyword"
              },
              "nan": {
                "normalizer": "lowercase",
                "type": "keyword"
              }
            },
            "type": "text"
          }
        }
      },
      {
        "NumericProperties": {
          "match": "n_*",
          "mapping": {
            "type": "double"
          }
        }
      },
      {
        "DateProperties": {
          "match": "d_*",
          "mapping": {
            "type": "date"
          }
        }
      },
      {
        "dense_vector_1536": {
          "match": "v1536_*",
          "mapping": {
            "dims": 1536,
            "element_type": "float",
            "index": true,
            "similarity": "dot_product",
            "type": "dense_vector"
          }
        }
      },
      {
        "dense_vector_3072": {
          "match": "v3072_*",
          "mapping": {
            "dims": 3072,
            "element_type": "float",
            "index": true,
            "similarity": "dot_product",
            "type": "dense_vector"
          }
        }
      }
    ],
    "properties": {
      "title": {
        "normalizer": "lowercase",
        "type": "keyword"
      },
      "type": {
        "type": "keyword"
      },
      "checkpointToken": {
        "type": "long"
      },
      "secondaryUpdateToken": {
        "type": "long"
      },
      "lastUpdated": {
        "type": "date"
      },
      "deleted": {
        "store": false,
        "type": "boolean"
      },
      "unsercured": {
        "type": "boolean"
      },
      "offline": {
        "type": "boolean"
      },
      "index": {
        "type": "keyword"
      },
      "ngrammed": {
        "analyzer": "trigram_standard",
        "type": "text"
      },
      "payload": {
        "index": false,
        "type": "keyword"
      },
      "securityTokens": {
        "analyzer": "not_analyzed_lowercase",
        "search_analyzer": "not_analyzed_lowercase",
        "type": "text"
      },
      "mainSearch": {
        "analyzer": "omnisearch_mainsearch",
        "fields": {
          "edge_n_gram": {
            "analyzer": "edge_ngram_standard_analyzer",
            "norms": false,
            "search_analyzer": "omnisearch_mainsearch",
            "type": "text"
          },
          "raw": {
            "normalizer": "lowercase",
            "type": "keyword"
          },
          "na": {
            "analyzer": "not_analyzed_lowercase",
            "type": "text"
          },
          "std": {
            "analyzer": "standard",
            "type": "text"
          }
        },
        "type": "text"
      },
      "mainsearch_it": {
        "analyzer": "italian",
        "type": "text"
      },
      "mainsearch_en": {
        "analyzer": "english",
        "type": "text"
      },
      "mainsearch_de": {
        "analyzer": "german",
        "type": "text"
      },
      "mainsearch_ru": {
        "analyzer": "russian",
        "type": "text"
      },
      "fulltext_it": {
        "analyzer": "italian",
        "type": "text"
      },
      "fulltext_en": {
        "analyzer": "english",
        "type": "text"
      },
      "fulltext_de": {
        "analyzer": "german",
        "type": "text"
      },
      "fulltext_ru": {
        "analyzer": "russian",
        "type": "text"
      },
      "nested": {
        "properties": {
          "name": {
            "analyzer": "not_analyzed_lowercase",
            "type": "text"
          },
          "depth": {
            "type": "integer"
          },
          "path": {
            "analyzer": "omnisearch_path_analyzer",
            "fields": {
              "na": {
                "analyzer": "not_analyzed_lowercase",
                "type": "text"
              }
            },
            "type": "text"
          },
          "svalue": {
            "fields": {
              "na": {
                "analyzer": "not_analyzed_lowercase",
                "type": "text"
              }
            },
            "type": "keyword"
          },
          "nvalue": {
            "type": "double"
          },
          "dvalue": {
            "type": "date"
          }
        },
        "type": "nested"
      },
      "internalData": {
        "index": false,
        "store": true,
        "type": "text"
      },
      "relatedIds": {
        "type": "keyword"
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_analyzer": {
          "type": "standard"
        },
        "omnisearch_path_analyzer": {
          "filter": "lowercase_filter",
          "tokenizer": "jarvis_path_tokenizer",
          "type": "custom"
        },
        "not_analyzed_lowercase": {
          "filter": [
            "lowercase_filter",
            "asciifolding"
          ],
          "tokenizer": "keyword_tokenizer",
          "type": "custom"
        },
        "omnisearch_mainsearch": {
          "filter": [
            "lowercase",
            "asciifolding"
          ],
          "tokenizer": "standard",
          "type": "custom"
        },
        "omnisearch_string_props": {
          "filter": [
            "lowercase",
            "asciifolding"
          ],
          "tokenizer": "standard",
          "type": "custom"
        },
        "edge_ngram_standard_analyzer": {
          "filter": [
            "lowercase_filter",
            "asciifolding",
            "edge_ngram_filter_standard"
          ],
          "tokenizer": "standard",
          "type": "custom"
        },
        "trigram_standard": {
          "filter": "lowercase_filter",
          "tokenizer": "trigram_tokenizer",
          "type": "custom"
        },
        "omni_property_indexTime": {
          "filter": "lowercase",
          "tokenizer": "standard",
          "type": "custom"
        }
      },
      "filter": {
        "lowercase_filter": {
          "type": "lowercase"
        },
        "edge_ngram_filter_standard": {
          "max_gram": 15,
          "min_gram": 2,
          "type": "edge_ngram"
        },
        "trim_zero_chars": {
          "max": 100,
          "min": 1,
          "type": "length"
        }
      },
      "tokenizer": {
        "jarvis_path_tokenizer": {
          "delimiter": "/",
          "type": "path_hierarchy"
        },
        "keyword_tokenizer": {
          "type": "keyword"
        },
        "edge_ngram_tokenizer": {
          "max_gram": 10,
          "min_gram": 3,
          "type": "edge_ngram"
        },
        "trigram_tokenizer": {
          "max_gram": 3,
          "min_gram": 3,
          "type": "ngram"
        },
        "non_ascii_and_space_split_lowercase_tokenizer": {
          "flags": "CASE_INSENSITIVE|MULTILINE",
          "group": -1,
          "pattern": "(?\u003C=[^\\p{ASCII}]|\\s)",
          "type": "pattern"
        }
      }
    },
    "number_of_replicas": 1,
    "number_of_shards": 1
  }
}
# Response:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16"
}

Then, doing the classic mapping request

http://localhost:9800/test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16/_mapping

I got this response

{
  "test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16": {
    "mappings": {
      "dynamic_templates": [
        {
          "StringProperties": {
            "match": "s_*",
            "mapping": {
              "analyzer": "omnisearch_string_props",
              "fields": {
                "na": {
                  "analyzer": "not_analyzed_lowercase",
                  "type": "text"
                },
                "raw": {
                  "type": "keyword"
                },
                "nan": {
                  "normalizer": "lowercase",
                  "type": "keyword"
                }
              },
              "type": "text"
            }
          }
        },
        {
          "NumericProperties": {
            "match": "n_*",
            "mapping": {
              "type": "double"
            }
          }
        },
        {
          "DateProperties": {
            "match": "d_*",
            "mapping": {
              "type": "date"
            }
          }
        },
        {
          "dense_vector_1536": {
            "match": "v1536_*",
            "mapping": {
              "dims": 1536,
              "element_type": "float",
              "index": true,
              "similarity": "dot_product",
              "type": "dense_vector"
            }
          }
        },
        {
          "dense_vector_3072": {
            "match": "v3072_*",
            "mapping": {
              "dims": 3072,
              "element_type": "float",
              "index": true,
              "similarity": "dot_product",
              "type": "dense_vector"
            }
          }
        }
      ],
      "properties": {
        "checkpointToken": {
          "type": "long"
        },
        "deleted": {
          "type": "boolean"
        },
        "fulltext_de": {
          "type": "text",
          "analyzer": "german"
        },
        "fulltext_en": {
          "type": "text",
          "analyzer": "english"
        },
        "fulltext_it": {
          "type": "text",
          "analyzer": "italian"
        },
        "fulltext_ru": {
          "type": "text",
          "analyzer": "russian"
        },
        "index": {
          "type": "keyword"
        },
        "internalData": {
          "type": "text",
          "index": false,
          "store": true
        },
        "lastUpdated": {
          "type": "date"
        },
        "mainSearch": {
          "type": "text",
          "fields": {
            "edge_n_gram": {
              "type": "text",
              "norms": false,
              "analyzer": "edge_ngram_standard_analyzer",
              "search_analyzer": "omnisearch_mainsearch"
            },
            "na": {
              "type": "text",
              "analyzer": "not_analyzed_lowercase"
            },
            "raw": {
              "type": "keyword",
              "normalizer": "lowercase"
            },
            "std": {
              "type": "text",
              "analyzer": "standard"
            }
          },
          "analyzer": "omnisearch_mainsearch"
        },
        "mainsearch_de": {
          "type": "text",
          "analyzer": "german"
        },
        "mainsearch_en": {
          "type": "text",
          "analyzer": "english"
        },
        "mainsearch_it": {
          "type": "text",
          "analyzer": "italian"
        },
        "mainsearch_ru": {
          "type": "text",
          "analyzer": "russian"
        },
        "nested": {
          "type": "nested",
          "properties": {
            "depth": {
              "type": "integer"
            },
            "dvalue": {
              "type": "date"
            },
            "name": {
              "type": "text",
              "analyzer": "not_analyzed_lowercase"
            },
            "nvalue": {
              "type": "double"
            },
            "path": {
              "type": "text",
              "fields": {
                "na": {
                  "type": "text",
                  "analyzer": "not_analyzed_lowercase"
                }
              },
              "analyzer": "omnisearch_path_analyzer"
            },
            "svalue": {
              "type": "keyword",
              "fields": {
                "na": {
                  "type": "text",
                  "analyzer": "not_analyzed_lowercase"
                }
              }
            }
          }
        },
        "ngrammed": {
          "type": "text",
          "analyzer": "trigram_standard"
        },
        "offline": {
          "type": "boolean"
        },
        "payload": {
          "type": "keyword",
          "index": false
        },
        "relatedIds": {
          "type": "keyword"
        },
        "secondaryUpdateToken": {
          "type": "long"
        },
        "securityTokens": {
          "type": "text",
          "analyzer": "not_analyzed_lowercase"
        },
        "title": {
          "type": "keyword",
          "normalizer": "lowercase"
        },
        "type": {
          "type": "keyword"
        },
        "unsercured": {
          "type": "boolean"
        }
      }
    }
  }
}
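Comparing the two dumps by eye is error-prone, so here is a small sketch that lists every field that sent a search_analyzer and reports whether the `_mapping` response echoes it back. The two JSON literals are trimmed copies of the `securityTokens` and `mainSearch` entries from the request and response above; the helper name `SearchAnalyzers` is made up for this example. In these dumps, the one field whose search_analyzer is returned (`mainSearch.edge_n_gram`) uses a value that differs from its analyzer, while `securityTokens` uses the same value for both. Whether the server intentionally omits a search_analyzer that equals the analyzer is an assumption that this thread does not confirm.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Collects "field path -> (analyzer, search_analyzer)" for every field that
// declares a search_analyzer, descending into "properties" and "fields".
static Dictionary<string, (string Analyzer, string SearchAnalyzer)> SearchAnalyzers(
    JsonElement props, string prefix = "")
{
    var result = new Dictionary<string, (string Analyzer, string SearchAnalyzer)>();
    foreach (var field in props.EnumerateObject())
    {
        var path = prefix + field.Name;
        if (field.Value.TryGetProperty("search_analyzer", out var sa))
        {
            var analyzer = field.Value.TryGetProperty("analyzer", out var a) ? a.GetString() : null;
            result[path] = (analyzer ?? "", sa.GetString() ?? "");
        }
        foreach (var container in new[] { "properties", "fields" })
        {
            if (field.Value.TryGetProperty(container, out var children))
                foreach (var child in SearchAnalyzers(children, path + "."))
                    result[child.Key] = child.Value;
        }
    }
    return result;
}

// Trimmed from the "properties" object of the create-index request.
var sent = JsonDocument.Parse("""
{
  "securityTokens": { "analyzer": "not_analyzed_lowercase", "search_analyzer": "not_analyzed_lowercase", "type": "text" },
  "mainSearch": {
    "analyzer": "omnisearch_mainsearch",
    "type": "text",
    "fields": {
      "edge_n_gram": { "analyzer": "edge_ngram_standard_analyzer", "search_analyzer": "omnisearch_mainsearch", "type": "text" }
    }
  }
}
""").RootElement;

// Trimmed from the "properties" object of the _mapping response.
var stored = JsonDocument.Parse("""
{
  "securityTokens": { "type": "text", "analyzer": "not_analyzed_lowercase" },
  "mainSearch": {
    "type": "text",
    "analyzer": "omnisearch_mainsearch",
    "fields": {
      "edge_n_gram": { "type": "text", "analyzer": "edge_ngram_standard_analyzer", "search_analyzer": "omnisearch_mainsearch" }
    }
  }
}
""").RootElement;

var sentSa = SearchAnalyzers(sent);
var storedSa = SearchAnalyzers(stored);

foreach (var (path, value) in sentSa)
{
    var same = value.Analyzer == value.SearchAnalyzer;
    var echoed = storedSa.ContainsKey(path);
    Console.WriteLine($"{path}: same as analyzer = {same}, present in _mapping = {echoed}");
}
```

To run this against the full payloads, pass the complete `properties` objects from the request and the response instead of the trimmed literals.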

If I have time, I'll try to reproduce this in a simple single-file project.

@flobernd
Member

Hi @alkampfergit , thanks for providing the JSON request/response payloads.

The request produced by Indices.CreateAsync correctly serializes the search_analyzer field, which means that this is not a client error.

Just to triple check, could you please execute the exact same request using curl or in the Kibana Dev Console? I strongly expect this to produce the same result.
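If curl or Kibana is not at hand, the same check can be sketched from .NET with a plain `HttpClient`, which bypasses the Elasticsearch client entirely. The assumptions here are an unsecured node at `http://localhost:9800`, a made-up index name (`search-analyzer-repro`), and a reduced request body: the analyzer is rebuilt from the built-in `keyword` tokenizer and the built-in `lowercase` and `asciifolding` filters instead of the custom `keyword_tokenizer` and `lowercase_filter` definitions from the full dump.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Assumption: unsecured local node; add authentication if your cluster needs it.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:9800/") };

// Hypothetical index name used only for this reproduction.
const string index = "search-analyzer-repro";

// Reduced body: one analyzer and the single field under test.
const string body = """
{
  "settings": {
    "analysis": {
      "analyzer": {
        "not_analyzed_lowercase": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "securityTokens": {
        "type": "text",
        "analyzer": "not_analyzed_lowercase",
        "search_analyzer": "not_analyzed_lowercase"
      }
    }
  }
}
""";

// Create the index with the raw JSON body.
var put = await http.PutAsync(index, new StringContent(body, Encoding.UTF8, "application/json"));
Console.WriteLine($"PUT {index}: {(int)put.StatusCode}");

// Read the mapping back exactly as the server reports it.
Console.WriteLine(await http.GetStringAsync($"{index}/_mapping?pretty"));
```

If the mapping printed here also lacks search_analyzer, the behavior comes from the server and not from the client's serialization.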

Unfortunately, I don't know why the server does not seem to save the search_analyzer setting. To clarify this, you may want to contact support or ask in our discuss forums.
