Elasticsearch IK: synonyms and near-synonyms

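1. Configuration:

Create a directory for the synonym dictionary:

/usr/local/elasticsearch-7.7.1/config/analysis

Here analysis is the directory that needs to be created.

Inside it, create the synonym dictionary file synonym.dic.

Edit the file and add the following entries: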

西红柿,番茄,tomato

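马铃薯,土豆

Save the file. The entries above define two synonym groups, one for 西红柿 (tomato) and one for 马铃薯 (potato); the terms on each line are treated as equivalent.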
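
The directory and dictionary file can also be created from a script. The following is a minimal sketch, assuming a POSIX shell; the function name create_synonym_dict is hypothetical, and the Elasticsearch config directory is passed as an argument instead of being hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: create <config_dir>/analysis/synonym.dic
# containing the two synonym groups from this post.
# $1 = Elasticsearch config directory
create_synonym_dict() {
    config_dir="$1"
    mkdir -p "$config_dir/analysis" || return 1
    # One synonym group per line, terms separated by commas.
    cat > "$config_dir/analysis/synonym.dic" <<'EOF'
西红柿,番茄,tomato
马铃薯,土豆
EOF
}
```

For the installation used here, the call would be create_synonym_dict /usr/local/elasticsearch-7.7.1/config. The synonyms_path setting is resolved relative to that config directory, which is why the index settings can refer to the file as analysis/synonym.dic.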

2. Usage:

Create an index with synonym support from PHP

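// Create the synonym index.
// $client is assumed to be an Elasticsearch\Client, e.g. built with ClientBuilder::create()->build().
        $params = [
            'index' => 'syno',
            'body' => [
                'settings' => [
                    "analysis" => [
                        "filter" => [
                            // Synonym filter backed by the dictionary file;
                            // the path is relative to the Elasticsearch config directory.
                            "my_synonym_filter" => [
                                "type" => "synonym",
                                "synonyms_path" => "analysis/synonym.dic"
                            ]
                        ],
                        "analyzer" => [
                            // Fine-grained IK tokenization plus synonyms.
                            "my_synonyms" => [
                                "tokenizer" => "ik_max_word",
                                "filter" => [
                                    "lowercase",
                                    "my_synonym_filter"
                                ]
                            ],
                            // Coarse-grained IK tokenization plus synonyms.
                            "my_synonyms_smart" => [
                                "tokenizer" => "ik_smart",
                                "filter" => [
                                    "lowercase",
                                    "my_synonym_filter"
                                ]
                            ]
                        ]
                    ]
                ],
                'mappings' => [ // mappings is another top-level element nested in body; it holds the field mappings
                    'properties' => [
                        "title" => [
                            "type" => "text",
                            "analyzer" => "my_synonyms"
                        ],
                        "content" => [
                            "type" => "text",
                            "analyzer" => "my_synonyms",
                            "search_analyzer" => "my_synonyms"
                        ]
                    ]
                ]
            ]
        ];
        $response = $client->indices()->create($params);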
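
Once the index exists, the _analyze API can be used to check that the synonym filter is actually applied. This is a minimal sketch, assuming Elasticsearch listens on 127.0.0.1:9200 and curl is installed; the ES_URL variable and the function names analyze_body and analyze_syno are hypothetical.

```shell
#!/bin/sh
# Assumed address of the local node; override by exporting ES_URL.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Print the JSON body for an _analyze request.
# $1 = analyzer name, $2 = text (must not contain double quotes or backslashes)
analyze_body() {
    printf '{"analyzer":"%s","text":"%s"}\n' "$1" "$2"
}

# Run the text through an analyzer defined on the syno index.
analyze_syno() {
    analyze_body "$1" "$2" |
        curl -s -H 'Content-Type: application/json' \
             "$ES_URL/syno/_analyze?pretty" --data-binary @-
}
```

Calling analyze_syno my_synonyms 番茄 sends the text through the my_synonyms analyzer. If the dictionary was loaded, the returned token list would be expected to contain 西红柿 and tomato in addition to 番茄.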

Index a document:

// Index a document
        $params = [
            'index' => 'syno',
            // 'type' => 'my_type',
            'type' => '_doc',
            'id' => '20',
            'body' => ['title' => '张大仙爱吃土豆', "content" => "张大仙和锅老师都喜欢西红柿"]
        ];
        $response = $client->index($params);
Check how the content field was tokenized:
curl "http://127.0.0.1:9200/syno/_doc/20/_termvectors?fields=content"
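
Besides inspecting the term vectors, a match query shows the synonyms taking effect at search time. This is a minimal sketch, assuming the same local node at 127.0.0.1:9200 and the syno index with document 20 from this post; the ES_URL variable and the function names match_body and search_syno are hypothetical.

```shell
#!/bin/sh
# Assumed address of the local node; override by exporting ES_URL.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Print the JSON body for a match query.
# $1 = field name, $2 = query text (must not contain double quotes or backslashes)
match_body() {
    printf '{"query":{"match":{"%s":"%s"}}}\n' "$1" "$2"
}

# Search the syno index with a match query on one field.
search_syno() {
    match_body "$1" "$2" |
        curl -s -H 'Content-Type: application/json' \
             "$ES_URL/syno/_search?pretty" --data-binary @-
}
```

With the dictionary in place, search_syno content 番茄 would be expected to return document 20, even though its content field contains 西红柿 and not 番茄. A newly indexed document only becomes searchable after the next index refresh, which by default happens about once per second.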