Searching Nested Child Documents

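This section covers techniques for searching deeply nested documents, and shows how to build more complex queries using some of Solr's query parsers and document transformers.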

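These features require _root_ and _nest_path_ to be declared in the schema. For details about schema and index configuration, see the section Indexing Nested Documents.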

This section does not demonstrate faceting on nested documents. For nested document faceting, see the section Block Join Facet Counts.

Query Examples

For the examples that follow, we assume an index containing the same documents covered in Indexing Nested Documents:

[{ "id": "P11!prod",
   "name_s": "Swingline Stapler",
   "description_t": "The Cadillac of office staplers ...",
   "skus": [ { "id": "P11!S21",
               "color_s": "RED",
               "price_i": 42,
               "manuals": [ { "id": "P11!D41",
                              "name_s": "Red Swingline Brochure",
                              "pages_i":1,
                              "content_t": "..."
                            } ]
             },
             { "id": "P11!S31",
               "color_s": "BLACK",
               "price_i": 3
             } ],
   "manuals": [ { "id": "P11!D51",
                  "name_s": "Quick Reference Guide",
                  "pages_i":1,
                  "content_t": "How to use your stapler ..."
                },
                { "id": "P11!D61",
                  "name_s": "Warranty Details",
                  "pages_i":42,
                  "content_t": "... lifetime guarantee ..."
                } ]
 },
 { "id": "P22!prod",
   "name_s": "Mont Blanc Fountain Pen",
   "description_t": "A Premium Writing Instrument ...",
   "skus": [ { "id": "P22!S22",
               "color_s": "RED",
               "price_i": 89,
               "manuals": [ { "id": "P22!D42",
                              "name_s": "Red Mont Blanc Brochure",
                              "pages_i":1,
                              "content_t": "..."
                            } ]
             },
             { "id": "P22!S32",
               "color_s": "BLACK",
               "price_i": 67
             } ],
   "manuals": [ { "id": "P22!D52",
                  "name_s": "How To Use A Pen",
                  "pages_i":42,
                  "content_t": "Start by removing the cap ..."
                } ]
 } ]

Child Doc Transformer

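By default, documents that match a query do not include any of their nested children in the response. The [child] doc transformer can be used to enrich query results with each document's descendants.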

For a detailed explanation of this transformer, including specifics on its syntax and limitations, see the section [child - ChildDocTransformerFactory].

A simple query matching all documents whose description includes "staplers":

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select?omitHeader=true&q=description_t:staplers'
{
  "response":{"numFound":1,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
      {
        "id":"P11!prod",
        "name_s":"Swingline Stapler",
        "description_t":"The Cadillac of office staplers ...",
        "_version_":1672933224035123200}]
  }}

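The same query is shown below with the [child] transformer added. Note that numFound has not changed: we are still matching the same set of documents, but when those documents are returned, their nested children are also returned as pseudo-fields.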

$ curl -g 'https://127.0.0.1:8983/solr/gettingstarted/select?omitHeader=true&q=description_t:staplers&fl=*,[child]'
{
  "response":{"numFound":1,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
      {
        "id":"P11!prod",
        "name_s":"Swingline Stapler",
        "description_t":"The Cadillac of office staplers ...",
        "_version_":1672933224035123200,
        "skus":[
          {
            "id":"P11!S21",
            "color_s":"RED",
            "price_i":42,
            "_version_":1672933224035123200,
            "manuals":[
              {
                "id":"P11!D41",
                "name_s":"Red Swingline Brochure",
                "pages_i":1,
                "content_t":"...",
                "_version_":1672933224035123200}]},

          {
            "id":"P11!S31",
            "color_s":"BLACK",
            "price_i":3,
            "_version_":1672933224035123200}],
        "manuals":[
          {
            "id":"P11!D51",
            "name_s":"Quick Reference Guide",
            "pages_i":1,
            "content_t":"How to use your stapler ...",
            "_version_":1672933224035123200},

          {
            "id":"P11!D61",
            "name_s":"Warranty Details",
            "pages_i":42,
            "content_t":"... lifetime guarantee ...",
            "_version_":1672933224035123200}]}]
  }}
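
Client code that consumes a response like the one above has to walk the pseudo-fields itself. The following is a minimal Python sketch of one way to do that, assuming the response has already been parsed from JSON and that every list of objects is a pseudo-field of child documents. The function name iter_nested and the trimmed sample document are illustrative and not part of Solr.

```python
def iter_nested(doc, depth=0, field=None):
    """Yield (depth, field, id) for a document and all of its nested children.

    Assumption: any value that is a non-empty list of dicts is a pseudo-field
    holding child documents, as in the [child] response shown above.
    """
    yield depth, field, doc["id"]
    for key, value in doc.items():
        if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            for child in value:
                yield from iter_nested(child, depth + 1, key)


# Trimmed copy of the response document shown above.
doc = {
    "id": "P11!prod",
    "name_s": "Swingline Stapler",
    "skus": [
        {"id": "P11!S21", "color_s": "RED", "price_i": 42,
         "manuals": [{"id": "P11!D41", "pages_i": 1}]},
        {"id": "P11!S31", "color_s": "BLACK", "price_i": 3},
    ],
    "manuals": [
        {"id": "P11!D51", "pages_i": 1},
        {"id": "P11!D61", "pages_i": 42},
    ],
}

nested = list(iter_nested(doc))
for depth, field, doc_id in nested:
    # Indent each document by its nesting depth.
    print("  " * depth + f"{field or 'root'}: {doc_id}")
```

The walk visits the product first, then each sku followed by its own manuals, then the product-level manuals, mirroring the layout of the response.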

Child Query Parser

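The {!child} query parser can be used to search for the descendant documents of parent documents that match the wrapped query. For a detailed explanation of this parser, see the section Block Join Children Query Parser.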

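Consider again the description_t:staplers query used above. If we wrap that query in a {!child} query parser, then instead of "matching" and returning the product-level documents, we match all of the descendant child documents of the documents matched by the original query: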

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select' -d 'omitHeader=true' -d 'q={!child of="*:* -_nest_path_:*"}description_t:staplers'
{
  "response":{"numFound":5,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
      {
        "id":"P11!D41",
        "name_s":"Red Swingline Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1672933224035123200},
      {
        "id":"P11!S21",
        "color_s":"RED",
        "price_i":42,
        "_version_":1672933224035123200},
      {
        "id":"P11!S31",
        "color_s":"BLACK",
        "price_i":3,
        "_version_":1672933224035123200},
      {
        "id":"P11!D51",
        "name_s":"Quick Reference Guide",
        "pages_i":1,
        "content_t":"How to use your stapler ...",
        "_version_":1672933224035123200},
      {
        "id":"P11!D61",
        "name_s":"Warranty Details",
        "pages_i":42,
        "content_t":"... lifetime guarantee ...",
        "_version_":1672933224035123200}]
  }}

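In this example we used *:* -_nest_path_:* as our of parameter to indicate that we want to consider all documents that have no nest path (that is, all "root" level documents) as the set of possible parents.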
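
The role of the of parameter can be illustrated with a toy in-memory model. The Python sketch below is not how Solr implements block joins; it only mimics the selection logic on a hand-flattened copy of the sample documents. The names DOCS, nearest_in_mask, and child_query are hypothetical, and the nest paths are simplified to the form used in the queries in this section.

```python
# Flattened view of the sample documents: (id, nest path, parent id).
# Assumption: children are listed before their parents, and paths are
# simplified to the form used in the queries in this section.
DOCS = [
    ("P11!D41", "/skus/manuals", "P11!S21"),
    ("P11!S21", "/skus", "P11!prod"),
    ("P11!S31", "/skus", "P11!prod"),
    ("P11!D51", "/manuals", "P11!prod"),
    ("P11!D61", "/manuals", "P11!prod"),
    ("P11!prod", None, None),
    ("P22!D42", "/skus/manuals", "P22!S22"),
    ("P22!S22", "/skus", "P22!prod"),
    ("P22!S32", "/skus", "P22!prod"),
    ("P22!D52", "/manuals", "P22!prod"),
    ("P22!prod", None, None),
]
PARENT = {doc_id: parent for doc_id, _, parent in DOCS}


def nearest_in_mask(doc_id, mask):
    """Return the closest strict ancestor of doc_id that is in mask, or None."""
    current = PARENT[doc_id]
    while current is not None and current not in mask:
        current = PARENT[current]
    return current


def child_query(mask, matching_parents):
    """Toy {!child}: documents outside mask whose nearest masked ancestor matched."""
    return [doc_id for doc_id, _, _ in DOCS
            if doc_id not in mask
            and nearest_in_mask(doc_id, mask) in matching_parents]


# of="*:* -_nest_path_:*" selects every document without a nest path.
roots = {doc_id for doc_id, path, _ in DOCS if path is None}
# description_t:staplers matches only the stapler product.
hits = child_query(roots, {"P11!prod"})
print(hits)
```

Under these assumptions the sketch selects the same five documents as the response above; none of the pen's descendants are selected because their root did not match the wrapped query.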

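By changing the of parameter to match ancestors at specific _nest_path_ levels, we can narrow down the list of children we return. In the query below, we search for all descendants of skus whose price_i is 50 or less (the range [* TO 50] is inclusive). The of parameter identifies all documents that do not have a _nest_path_ with the prefix /skus/*: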

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!child of="*:* -_nest_path_:\\/skus\\/*"}(+price_i:[* TO 50] +_nest_path_:\/skus)'
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
      {
        "id":"P11!D41",
        "name_s":"Red Swingline Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1675662666752851968}]
  }}

Double Escaping _nest_path_ Slashes in of

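Note that in the above example, the / characters in the _nest_path_ are "double escaped" in the of parameter:

  • One level of \ escaping is necessary to prevent the / from being interpreted as a regular expression query.

  • An additional level of "escaping the escape character" is necessary because the of local parameter is a quoted string, so we need a second \ to ensure that the first \ is preserved and passed as is to the query parser.

(Only a single level of \ escaping is needed in the body of the query string, to prevent the regular expression syntax, because the body is not a quoted-string local parameter.)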
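
The two escaping levels can be reproduced mechanically. The following Python sketch rebuilds the q parameter from the example above; the helper names escape_slashes and escape_for_quoted_local_param are hypothetical, and the second helper handles only backslashes, which is all this example needs.

```python
def escape_slashes(path):
    """Escape each / once, as needed in the body of a query string."""
    return path.replace("/", "\\/")


def escape_for_quoted_local_param(text):
    """Double every backslash so it survives inside a quoted local param."""
    return text.replace("\\", "\\\\")


# Body of the query: one level of escaping.
body_clause = "+_nest_path_:" + escape_slashes("/skus")

# The block mask gets the same first level of escaping ...
mask = "*:* -_nest_path_:" + escape_slashes("/skus/") + "*"
# ... and a second level because it is placed inside a quoted string.
of_param = 'of="' + escape_for_quoted_local_param(mask) + '"'

query = "{!child " + of_param + "}(+price_i:[* TO 50] " + body_clause + ")"
print(query)
```

The printed string is what the curl example passes as q via --data-urlencode, which in turn takes care of URL encoding.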

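You may find it more convenient to express the same query in a more verbose form, using parameter references together with other parsers that do not treat / as a special character: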

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!child of=$block_mask}(+price_i:[* TO 50] +{!field f="_nest_path_" v="/skus"})' --data-urlencode 'block_mask=(*:* -{!prefix f="_nest_path_" v="/skus/"})'
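
The same request can be assembled from client code. The sketch below uses only the Python standard library to build the form-encoded body that --data-urlencode would produce for the request above; it does not send anything, and the variable names are illustrative.

```python
from urllib.parse import urlencode, parse_qs

# Parameters copied from the curl example above. The query refers to the
# block mask by name ($block_mask), so no backslash escaping is needed.
params = {
    "omitHeader": "true",
    "q": '{!child of=$block_mask}(+price_i:[* TO 50] +{!field f="_nest_path_" v="/skus"})',
    "block_mask": '(*:* -{!prefix f="_nest_path_" v="/skus/"})',
}

# Percent-encode every value for an application/x-www-form-urlencoded body.
body = urlencode(params)
print(body)

# Round trip: decoding the body yields the original parameter values.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```

The encoded body could then be sent as a POST to the select handler, for example by passing body.encode() as the data argument of urllib.request.urlopen, assuming the Solr instance from the examples is reachable at that address.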

Parent Query Parser

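The inverse of the {!child} query parser is the {!parent} query parser, which lets you search for the ancestor documents of child documents that match a wrapped query. For a detailed explanation of this parser, see the section Block Join Parent Query Parser.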

Let's first consider this example, which searches for all "manual" type documents that have exactly one page:

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select?omitHeader=true&q=pages_i:1'
{
  "response":{"numFound":3,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
      {
        "id":"P11!D41",
        "name_s":"Red Swingline Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1676585794196733952},
      {
        "id":"P11!D51",
        "name_s":"Quick Reference Guide",
        "pages_i":1,
        "content_t":"How to use your stapler ...",
        "_version_":1676585794196733952},
      {
        "id":"P22!D42",
        "name_s":"Red Mont Blanc Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1676585794347728896}]
  }}

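We can wrap a query like this in a {!parent} query to return the details of all products that are ancestors of such manuals (here restricted to manuals nested under skus):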

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!parent which="*:* -_nest_path_:*"}(+_nest_path_:\/skus\/manuals +pages_i:1)'
{
  "response":{"numFound":2,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
      {
        "id":"P11!prod",
        "name_s":"Swingline Stapler",
        "description_t":"The Cadillac of office staplers ...",
        "_version_":1676585794196733952},
      {
        "id":"P22!prod",
        "name_s":"Mont Blanc Fountain Pen",
        "description_t":"A Premium Writing Instrument ...",
        "_version_":1676585794347728896}]
  }}

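In this example we used *:* -_nest_path_:* as our which parameter to indicate that we want to consider all documents that have no nest path (that is, all "root" level documents) as the set of possible parents.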

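By changing the which parameter to match ancestors at specific _nest_path_ levels, we can change the type of ancestors we return. In the query below, we search for skus that are the ancestors of manuals with exactly 1 page. The which parameter identifies all documents that do not have a _nest_path_ with the prefix /skus/*: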

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!parent which="*:* -_nest_path_:\\/skus\\/*"}(+_nest_path_:\/skus\/manuals +pages_i:1)'
{
  "response":{"numFound":2,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
      {
        "id":"P11!S21",
        "color_s":"RED",
        "price_i":42,
        "_version_":1676585794196733952},
      {
        "id":"P22!S22",
        "color_s":"RED",
        "price_i":89,
        "_version_":1676585794347728896}]
  }}

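Note that in the above example, the / characters in the _nest_path_ are "double escaped" in the which parameter, for the same reasons discussed above for the of parameter of the {!child} parser.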
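
The effect of the two which parameters can be compared with a toy in-memory model. The Python sketch below is not Solr's implementation; it only mimics the selection logic on a hand-flattened copy of the sample documents. The names DOCS and parent_query are hypothetical, and nest paths are simplified to the form used in the queries above.

```python
# (id, nest path, parent id, pages_i) for the sample documents.
# Assumption: paths are simplified to the form used in the queries above.
DOCS = [
    ("P11!D41", "/skus/manuals", "P11!S21", 1),
    ("P11!S21", "/skus", "P11!prod", None),
    ("P11!S31", "/skus", "P11!prod", None),
    ("P11!D51", "/manuals", "P11!prod", 1),
    ("P11!D61", "/manuals", "P11!prod", 42),
    ("P11!prod", None, None, None),
    ("P22!D42", "/skus/manuals", "P22!S22", 1),
    ("P22!S22", "/skus", "P22!prod", None),
    ("P22!S32", "/skus", "P22!prod", None),
    ("P22!D52", "/manuals", "P22!prod", 42),
    ("P22!prod", None, None, None),
]
PARENT = {d[0]: d[2] for d in DOCS}


def parent_query(mask, matching_children):
    """Toy {!parent}: nearest ancestor in mask of each matching child."""
    found = set()
    for child in matching_children:
        current = PARENT[child]
        while current is not None and current not in mask:
            current = PARENT[current]
        if current is not None:
            found.add(current)
    # Report results in the order the documents are listed.
    return [d[0] for d in DOCS if d[0] in found]


# Wrapped query: +_nest_path_:\/skus\/manuals +pages_i:1
children = [d[0] for d in DOCS if d[1] == "/skus/manuals" and d[3] == 1]

# which="*:* -_nest_path_:*" keeps only root documents.
root_mask = {d[0] for d in DOCS if d[1] is None}
# which="*:* -_nest_path_:\\/skus\\/*" keeps everything not nested below a sku.
sku_mask = {d[0] for d in DOCS if d[1] is None or not d[1].startswith("/skus/")}

products = parent_query(root_mask, children)
skus = parent_query(sku_mask, children)
print(products, skus)
```

With the root mask the nearest masked ancestor of each brochure is its product; with the wider mask the search stops one level earlier, at the sku, which matches the two responses shown above.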

Combining Block Join Query Parsers with Child Doc Transformer

These two parsers can be combined with the [child] transformer to build very powerful queries.

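Here, for example, is a query where:

  • All returned (sku) documents must have a color of "RED"

  • All returned (sku) documents must be descendants of root level (product) documents which have:

    • immediate child "manuals" documents which have:

      • "lifetime guarantee" in their content

  • Each returned (sku) document also includes any descendant (manuals) documents it has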

$ curl 'https://127.0.0.1:8983/solr/gettingstarted/select' -d 'omitHeader=true' -d 'fq=color_s:RED' --data-urlencode 'q={!child of="*:* -_nest_path_:*" filters=$parent_fq}' --data-urlencode 'parent_fq={!parent which="*:* -_nest_path_:*"}(+_nest_path_:"/manuals" +content_t:"lifetime guarantee")' -d 'fl=*,[child]'
{
  "response":{"numFound":1,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
      {
        "id":"P11!S21",
        "color_s":"RED",
        "price_i":42,
        "_version_":1676585794196733952,
        "manuals":[
          {
            "id":"P11!D41",
            "name_s":"Red Swingline Brochure",
            "pages_i":1,
            "content_t":"...",
            "_version_":1676585794196733952}]}]
  }}
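
The combined query can be read as four steps, which the following Python sketch walks through on a hand-flattened copy of the sample documents. This is a toy model and not Solr's implementation: the names DOCS and root_of are hypothetical, nest paths are simplified, and the phrase match is approximated with a plain substring test rather than analyzed text search.

```python
# (id, nest path, parent id, color_s, content_t) for the sample documents.
# Assumption: paths are simplified to the form used in the queries above.
DOCS = [
    ("P11!D41", "/skus/manuals", "P11!S21", None, "..."),
    ("P11!S21", "/skus", "P11!prod", "RED", None),
    ("P11!S31", "/skus", "P11!prod", "BLACK", None),
    ("P11!D51", "/manuals", "P11!prod", None, "How to use your stapler ..."),
    ("P11!D61", "/manuals", "P11!prod", None, "... lifetime guarantee ..."),
    ("P11!prod", None, None, None, None),
    ("P22!D42", "/skus/manuals", "P22!S22", None, "..."),
    ("P22!S22", "/skus", "P22!prod", "RED", None),
    ("P22!S32", "/skus", "P22!prod", "BLACK", None),
    ("P22!D52", "/manuals", "P22!prod", None, "Start by removing the cap ..."),
    ("P22!prod", None, None, None, None),
]
PARENT = {d[0]: d[2] for d in DOCS}


def root_of(doc_id):
    """Follow parent links up to the root (product) document."""
    while PARENT[doc_id] is not None:
        doc_id = PARENT[doc_id]
    return doc_id


# Step 1 (parent_fq): products with a direct "manuals" child containing the phrase.
products = {d[2] for d in DOCS
            if d[1] == "/manuals" and "lifetime guarantee" in d[4]}

# Step 2 (q): every descendant of those products.
descendants = [d for d in DOCS if d[1] is not None and root_of(d[0]) in products]

# Step 3 (fq=color_s:RED): keep only the RED documents.
red = [d[0] for d in descendants if d[3] == "RED"]

# Step 4 (fl=*,[child]): attach each hit's children. Direct children are
# enough here because skus nest only one level deep in the sample data.
result = {hit: [d[0] for d in DOCS if PARENT[d[0]] == hit] for hit in red}
print(result)
```

Only the stapler has a product-level manual mentioning the guarantee, so the red pen sku is excluded even though it matches the color filter.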