09 Detailed Explanation of Query DSL - Term Queries #

Query DSL (Domain Specific Language) is the JSON-based query language that Elasticsearch uses to search and retrieve documents. In this section, we will explain the Term query in the DSL.

The Term query is used to search for exact matches of terms in a specific field. It is often used for searching against fields that are not analyzed, such as a keyword field.

To use the Term query, you need to specify the field name and the exact term you want to search for. Here is an example:

{
  "query": {
    "term": {
      "field_name": "exact_term"
    }
  }
}

In the above example, “field_name” is the field in which you want to search, and “exact_term” is the exact term you want to find.

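It’s important to note that the Term query is case-sensitive: the search term is not analyzed, so it only matches indexed terms with exactly the same casing. For a case-insensitive search, you can set the query’s “case_insensitive” parameter to true (available in Elasticsearch 7.10 and later), or index the field as a keyword with a lowercase normalizer.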

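The Term query is useful when you want exact matches and the search term should not be analyzed. It is usually not suitable for text fields: their values are analyzed at index time (for example, split into tokens and lowercased), so the original value is often no longer present as a single exact term. Use a match query for text fields instead.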
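The difference between keyword and text fields can be sketched locally in Python. This is only an illustration: `analyze_as_text` is a hypothetical, very rough stand-in for the default text analysis (lowercase, split on non-alphanumeric characters), not Elasticsearch's actual analyzer.

```python
import re

def analyze_as_text(value):
    """Rough stand-in for text analysis: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]

def analyze_as_keyword(value):
    """A keyword field indexes the whole value as one unmodified term."""
    return [value]

def term_matches(indexed_terms, query_term):
    """A term query compares the query term to indexed terms without analyzing it."""
    return query_term in indexed_terms

name = "Jane Smith"
keyword_terms = analyze_as_keyword(name)  # ['Jane Smith']
text_terms = analyze_as_text(name)        # ['jane', 'smith']

print(term_matches(keyword_terms, "Jane Smith"))  # True: exact value, exact casing
print(term_matches(keyword_terms, "jane smith"))  # False: casing differs
print(term_matches(text_terms, "Jane"))           # False: indexed token is lowercase
print(term_matches(text_terms, "jane"))           # True
```

The query term is never changed in this sketch, only the indexed side is, which is why the same search behaves differently on the two field types.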

This is a brief overview of the Term query in DSL queries. In the next sections, we will discuss more advanced query types and their usage.

Introduction to Term Query #

As mentioned earlier, there are two types of queries: text queries and term queries. This article mainly focuses on term queries.

Term Query #

Most term queries are commonly used and not difficult, but they are easiest to understand through examples. Here, I have put together one set of test data, drawing on the official documentation, that covers all the examples. @pdai

Prepare data

PUT /test-dsl-term-level
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "programming_languages": {
        "type": "keyword"
      },
      "required_matches": {
        "type": "long"
      }
    }
  }
}

POST /test-dsl-term-level/_bulk
{ "index": { "_id": 1 }}
{"name": "Jane Smith", "programming_languages": [ "c++", "java" ], "required_matches": 2}
{ "index": { "_id": 2 }}
{"name": "Jason Response", "programming_languages": [ "java", "php" ], "required_matches": 2}
{ "index": { "_id": 3 }}
{"name": "Dave Pdai", "programming_languages": [ "java", "c++", "php" ], "required_matches": 3, "remarks": "hello world"}

Existence of a Field: exists #

An indexed value may not exist for a document’s field for several reasons:

  • The field in the source JSON is null or []
  • The field has been set to “index”: false in the mapping
  • The length of the field value exceeds the “ignore_above” setting in the mapping
  • The field value is in an incorrect format and “ignore_malformed” is defined in the mapping

The exists query therefore returns only the documents that contain an indexed value for the given field.
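As a minimal sketch, the Python below builds the request body for an exists query and approximates its result against the sample documents. The choice of the “remarks” field and the helper names `exists_query` and `has_indexed_value` are assumptions made for illustration; the local check only covers the missing, null, and [] cases, not the mapping-related ones.

```python
import json

# Sample documents from the "Prepare data" step, keyed by _id.
docs = {
    1: {"name": "Jane Smith", "programming_languages": ["c++", "java"], "required_matches": 2},
    2: {"name": "Jason Response", "programming_languages": ["java", "php"], "required_matches": 2},
    3: {"name": "Dave Pdai", "programming_languages": ["java", "c++", "php"],
        "required_matches": 3, "remarks": "hello world"},
}

def exists_query(field):
    """Build the request body for an exists query on `field`."""
    return {"query": {"exists": {"field": field}}}

def has_indexed_value(doc, field):
    """Local approximation: a missing field, null, or [] leaves nothing to index."""
    value = doc.get(field)
    return value is not None and value != []

body = exists_query("remarks")
print("GET /test-dsl-term-level/_search")
print(json.dumps(body, indent=2))

matching_ids = [doc_id for doc_id, doc in docs.items() if has_indexed_value(doc, "remarks")]
print(matching_ids)  # [3]
```

Only the third document carries a “remarks” value, so it is the only one expected to match.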

ID Query: ids #

The ids query is used to search for documents by their IDs.

GET /test-dsl-term-level/_search
{
  "query": {
    "ids": {
      "values": [3, 1]
    }
  }
}

Prefix Query: prefix #

Returns documents whose field value starts with the given prefix.

GET /test-dsl-term-level/_search
{
  "query": {
    "prefix": {
      "name": {
        "value": "Jan"
      }
    }
  }
}

Term Query: term #

Returns documents that contain the exact term in the field. This is the most common way to search a keyword field.

GET /test-dsl-term-level/_search
{
  "query": {
    "term": {
      "programming_languages": "php"
    }
  }
}

Terms Query: terms #

Search by multiple terms. The terms are combined with OR, so a document matches if its field contains at least one of them.

GET /test-dsl-term-level/_search
{
  "query": {
    "terms": {
      "programming_languages": ["php","c++"]
    }
  }
}

Terms Set Query with a Numeric Field: terms_set #

This query returns documents that contain at least a minimum number of the given terms. The minimum is read per document from a numeric field, here “required_matches”.

GET /test-dsl-term-level/_search
{
  "query": {
    "terms_set": {
      "programming_languages": {
        "terms": [ "java", "php" ],
        "minimum_should_match_field": "required_matches"
      }
    }
  }
}
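To see which documents the terms_set request above should return, the matching rule can be sketched in plain Python against the sample data. The helpers `terms_matches` and `terms_set_matches` are hypothetical names for this illustration and only model the counting logic, not scoring.

```python
# Relevant fields of the sample documents, keyed by _id.
docs = {
    1: {"programming_languages": ["c++", "java"], "required_matches": 2},
    2: {"programming_languages": ["java", "php"], "required_matches": 2},
    3: {"programming_languages": ["java", "c++", "php"], "required_matches": 3},
}

def terms_matches(doc, field, terms):
    """terms: at least one of the given terms must be present (OR)."""
    return len(set(doc.get(field, [])) & set(terms)) >= 1

def terms_set_matches(doc, field, terms, minimum_should_match_field):
    """terms_set: the number of matching terms must reach the document's own minimum."""
    matched = len(set(doc.get(field, [])) & set(terms))
    return matched >= doc[minimum_should_match_field]

query_terms = ["java", "php"]

terms_hits = [i for i, d in docs.items()
              if terms_matches(d, "programming_languages", query_terms)]
terms_set_hits = [i for i, d in docs.items()
                  if terms_set_matches(d, "programming_languages", query_terms, "required_matches")]

print(terms_hits)      # [1, 2, 3]
print(terms_set_hits)  # [2]
```

Document 1 contains only one of the two terms but requires two, and document 3 requires three matches while only two terms are supplied, so only document 2 satisfies its own minimum.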

Wildcard Query: wildcard #

Wildcard matching against the whole term: * matches zero or more characters, and ? matches exactly one character.

GET /test-dsl-term-level/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "D*ai",
        "boost": 1.0,
        "rewrite": "constant_score"
      }
    }
  }
}
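The wildcard semantics can be approximated locally by translating the pattern into a regular expression, where * stands for zero or more characters and ? for exactly one. The sketch below is an assumption-laden illustration (the helper names are hypothetical, and escape sequences in the pattern are ignored), not how Elasticsearch evaluates the query.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * (zero or more chars) and ? (exactly one char) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_matches(term, pattern):
    """The pattern has to cover the whole term, so use fullmatch."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

names = ["Jane Smith", "Jason Response", "Dave Pdai"]

print([n for n in names if wildcard_matches(n, "D*ai")])   # ['Dave Pdai']
print([n for n in names if wildcard_matches(n, "Ja?e*")])  # ['Jane Smith']
```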

Range Query: range #

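Returns documents whose field value falls within a range. It is commonly used for numeric or date fields; gte means greater than or equal to, and lte means less than or equal to.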

GET /test-dsl-term-level/_search
{
  "query": {
    "range": {
      "required_matches": {
        "gte": 3,
        "lte": 4
      }
    }
  }
}

Regexp Query: regexp #

Search using a regular expression. The pattern must match the entire term, not just part of it.

Search for name fields starting with “Ja”, ignoring case.

GET /test-dsl-term-level/_search
{
  "query": {
    "regexp": {
      "name": {
        "value": "Ja.*",
        "case_insensitive": true
      }
    }
  }
}
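The effect of the pattern “Ja.*” on the sample names can be checked with Python's `re` module. This is only an approximation: Python's regular expression syntax differs from the one Elasticsearch uses, although this simple pattern reads the same in both. `re.fullmatch` is used because the pattern has to match the whole term; `regexp_matches` is a hypothetical helper name.

```python
import re

names = ["Jane Smith", "Jason Response", "Dave Pdai"]

def regexp_matches(term, pattern, case_insensitive=False):
    """The pattern must match the whole term, hence fullmatch rather than search."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.fullmatch(pattern, term, flags) is not None

print([n for n in names if regexp_matches(n, "Ja.*", case_insensitive=True)])
# ['Jane Smith', 'Jason Response']

print([n for n in names if regexp_matches(n, "ja.*")])
# []  (case-sensitive, and no name starts with a lowercase "ja")

print([n for n in names if regexp_matches(n, "Ja", case_insensitive=True)])
# []  (a pattern that covers only part of the term does not match)
```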

Fuzzy Matching: fuzzy #

According to the official documentation, fuzzy matching is based on edit distance, which is the number of character changes required to convert one term to another. These changes can include:

  • Changing a character (box → fox)
  • Removing a character (black → lack)
  • Inserting a character (sic → sick)
  • Transposing two adjacent characters (act → cat)

GET /test-dsl-term-level/_search
{
  "query": {
    "fuzzy": {
      "remarks": {
        "value": "hell"
      }
    }
  }
}
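The edit distance behind fuzzy matching can be sketched with a small dynamic-programming function. The version below computes the optimal string alignment distance, which counts substitutions, deletions, insertions, and adjacent transpositions as one edit each; it is an illustration of the idea, not Elasticsearch's implementation. It also assumes that “remarks”, which is not in the mapping, is indexed as analyzed text, so that “hello world” is stored as the terms “hello” and “world”.

```python
def edit_distance(a, b):
    """Optimal string alignment distance between strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # change a character (or keep it)
            )
            # Transpose two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

pairs = [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat"), ("hell", "hello")]
for source, target in pairs:
    print(source, "->", target, edit_distance(source, target))  # each pair is 1 edit apart
```

Under that assumption, the value “hell” is one insertion away from the indexed term “hello”, which is why the fuzzy query above can find the third document.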

Reference Article #

https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html