Ideas for Arc XP

Better stemming for the Content API Elasticsearch endpoints

The usual Elasticsearch English text analyzer features are not available. This substantially complicates query generation code and lowers result quality. Example: possessives. "Amber's" searches differently from "Amber". We'd also like to have diacritics stripped, etc. The default "English" analyzer in Elasticsearch would be perfect, but just possessives and diacritics would be a huge help.
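To illustrate the kind of workaround this pushes into query generation code, here is a minimal Python sketch that strips possessives and diacritics from query terms before they are sent. The function names (normalize_term, normalize_query) are hypothetical and not part of any Arc XP API, and this only covers the query side; it does not help if the indexed text still contains the original forms.

```python
import re
import unicodedata

# Trailing 's with either a straight or a curly apostrophe.
_POSSESSIVE = re.compile(r"['\u2019]s$", re.IGNORECASE)


def strip_diacritics(text: str) -> str:
    """Decompose characters and drop the combining marks (e.g. é becomes e)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_term(term: str) -> str:
    """Hypothetical helper: remove a possessive suffix, fold diacritics, lowercase."""
    return strip_diacritics(_POSSESSIVE.sub("", term)).lower()


def normalize_query(query: str) -> str:
    """Apply normalize_term to each whitespace-separated term."""
    return " ".join(normalize_term(t) for t in query.split())
```

With this in place, "Amber's" and "Amber" normalize to the same term, which is roughly what analyzer-level possessive handling would give us for free on both the index and the query side.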

Note that while DMN has Spanish-language content, the English analyzer would work much better than the current configuration.

Allowing more control over Elasticsearch ingestion would be extremely helpful, but just having a better set of defaults would solve most of the immediate problems.
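As a rough idea of what a better default could look like, the sketch below builds index settings for a custom analyzer that mirrors the built-in English analyzer's filter chain and adds ASCII folding for diacritics. The filter types used (stemmer with possessive_english and english, stop with _english_, lowercase, asciifolding) are standard Elasticsearch ones, but the analyzer name english_folding and the function name are made up for illustration, and how Arc XP would apply such settings to the Content API indices is an assumption on our part.

```python
import json


def english_folding_settings() -> dict:
    """Hypothetical index settings: English analysis plus diacritic folding."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Removes trailing 's from tokens.
                    "english_possessive_stemmer": {
                        "type": "stemmer",
                        "language": "possessive_english",
                    },
                    "english_stop": {"type": "stop", "stopwords": "_english_"},
                    "english_stemmer": {"type": "stemmer", "language": "english"},
                },
                "analyzer": {
                    # "english_folding" is an illustrative name, not an existing analyzer.
                    "english_folding": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": [
                            "english_possessive_stemmer",
                            "lowercase",
                            "asciifolding",  # strips diacritics
                            "english_stop",
                            "english_stemmer",
                        ],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(english_folding_settings(), indent=2))
```

Even without the stop word and stemmer filters, the possessive and asciifolding filters alone would address the two problems described above.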

Note the above refers to the Content API endpoints, *not* the Site Search endpoint.

  • Christopher St. John
  • Dec 12 2019
  • Will not implement
  • Guest commented
    9 Feb, 2021 02:57am
  • Lucas Kerdo commented
    19 Feb, 2020 03:06pm

    We are having the same problem with French-language content (IPM Group). Could we have an explanation or an update?

  • Claire Campbell commented
    23 Dec, 2019 06:27pm

    Since this just got marked as "Will not implement", could we get an explanation of why?