Search Engine 6

[Elastic Search] MBTI Search Project - 3. Building the API

This post serves the MBTI search engine data through an API. API list: output the top 100 keywords for each MBTI type across all documents (ranked by document count); output the top 100 keywords for the E or I type; output the top 100 keywords matching a search term for the E or I type. The implemented APIs are as follows. Output the top 100 keywords for each MBTI type across all documents: @app.get('/top/keywords/{mbti_type}') def get_top_keywords(mbti_type: str, q:Optional[str]=None): es_query = { "size": 0, "query": {"match": {"keyword": mbti_type}}, "aggs..
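The excerpt cuts off before the aggregation, so the following is only a minimal sketch of how such a request body might be assembled, not the author's actual query. The helper name `build_top_keywords_query`, the aggregation name `top_keywords`, and the field names `contents` and `contents.term` are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def build_top_keywords_query(
    mbti_type: str,
    q: Optional[str] = None,
    size: int = 100,
) -> Dict[str, Any]:
    """Build a request body that returns aggregation buckets only."""
    # Filter on the MBTI type, mirroring the match query in the excerpt.
    must = [{"match": {"keyword": mbti_type}}]
    if q:
        # Hypothetical: narrow the documents by a search term.
        must.append({"match": {"contents": q}})
    return {
        "size": 0,  # skip the hits, keep only the aggregation result
        "query": {"bool": {"must": must}},
        "aggs": {
            "top_keywords": {
                # Hypothetical field name. A terms aggregation orders its
                # buckets by document count (descending) by default.
                "terms": {"field": "contents.term", "size": size}
            }
        },
    }
```

Keeping the query construction in a plain function lets the body be inspected or unit-tested without a running cluster; the route handler would then only pass the result to the search client.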

Elastic Search 2022.04.29

[Elastic Search] MBTI Search Project - 2. Emoji Search and Aggregation (Part 3)

Re-Index. Now that emoticon search is working, let's find out how many emoticons each original text contains and which emoticons appear most often. As preparation, we configure the index so that keywords (called Terms inside ES) can be extracted from data stored in a text field, and then look for a way to extract data ranked by keyword frequency across all documents. Index configuration: PUT /mbti_term { "settings": { "analysis": { "analyzer": { "nori_mixed": { "tokenizer": "nori_t_mixed", "filter": "shingle" }, "nori_pos_noun": { "type": "custom", "tokenizer": "nori_t_mixed", "..

Elastic Search 2022.04.24

[Elastic Search] MBTI Search Project - 2. Emoji Search and Aggregation (Part 2)

I parsed only the emoticons out of the existing content and collected the data in an RDB. (Analyzing them by applying a regex filter in the ES Analyzer is left for next time!) I set up the schema as follows and inserted the data (RDB). Table: t_emoji_dashboard. Columns: emoji, mbti_type (the MBTI type), emoji_count (the number of occurrences per document). SELECT emoji, mbti_type, sum(emoji_count) FROM t_emoji_dashboard WHERE emoji = '😘' GROUP BY emoji, mbti_type ORDER BY mbti_type, sum DESC. This is the result of running that query. (Only a specific emoticon was queried..
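The aggregation in the excerpt can be tried out locally with a small sketch. This one assumes an in-memory SQLite database in place of the author's RDB (which the excerpt does not name), made-up sample rows, and an explicit `total` alias for the summed column; the helper name `emoji_totals` is also hypothetical.

```python
import sqlite3

KISS = "\U0001F618"  # the emoticon filtered on in the excerpt's query
JOY = "\U0001F602"   # an extra emoticon, present only in the sample data

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_emoji_dashboard ("
    "emoji TEXT, mbti_type TEXT, emoji_count INTEGER)"
)
# Made-up rows: one row per (document, emoticon) occurrence count.
conn.executemany(
    "INSERT INTO t_emoji_dashboard VALUES (?, ?, ?)",
    [(KISS, "ENFP", 3), (KISS, "ENFP", 2), (KISS, "INTJ", 1), (JOY, "ENFP", 5)],
)


def emoji_totals(db: sqlite3.Connection, emoji: str):
    """Sum the per-document counts of one emoticon for each MBTI type."""
    sql = (
        "SELECT emoji, mbti_type, SUM(emoji_count) AS total "
        "FROM t_emoji_dashboard WHERE emoji = ? "
        "GROUP BY emoji, mbti_type "
        "ORDER BY mbti_type, total DESC"
    )
    # The emoticon is bound as a parameter rather than inlined in the SQL.
    return db.execute(sql, (emoji,)).fetchall()
```

With the sample rows, `emoji_totals(conn, KISS)` yields one row per MBTI type holding the summed count for that emoticon.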

Elastic Search 2022.04.21

[Elastic Search] MBTI Search Project - 1. Search Score Tuning

I am currently working on a project that uses Elasticsearch to query collected data (text data for each MBTI type). As part of it, I am tuning ES queries to analyze the correlation between MBTI type and smartphone (iPhone or Galaxy). Index configuration: to extract and analyze only the nouns in the content, I created a separate analyzer named nori_noun and set it up as a field. { "mbti" : { "aliases" : { }, "mappings" : { "properties" : { "comment_cnt" : { "type" : "integer" }, "contents" : { "type" : "text", "fields" : { "full" : { "type" : "keyword" }, "nori_mixed" : { "t..

Elastic Search 2022.04.12

[Elastic Search] Implementing Search (with Fast API)

I built a search engine with ES. To use it like a real service, let's implement a REST API. Viewed simply, the REST API logic has two steps. 1. The user enters a search keyword. 2. The documents matching the search keyword are found. Implementing user input: the point to consider here is that taking a single keyword as input is not enough to search documents in detail. A search system serves users better if it supports a variety of conditions such as synonyms, excluded words, multiple keywords, AND conditions, and OR conditions. So I put together a set of special commands that can be used in the search keyword. Single search: search= e.g. sweatshirt. Synonym search: search= e.g. sweatshirt +Adidas => among documents containing sweatshirt..

Elastic Search 2022.02.15

[Elastic Search] Applying the Nori Tokenizer & Filter

While studying Elastic Search queries in the previous post, I wanted to query the data in a bit more detail. So I looked for a way to make search more precise by applying a Korean morphological analyzer to the stored text. Elastic Search Korean morphological analyzer: from Elastic Search 7.0 onward, a Korean morphological analyzer called Nori can be used. (Officially it has been provided since version 6.6.) To install Nori, follow the link below. https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori.html Current state: previously, the index and analyzer were configured as shown below. { "settings": { "inde..

Elastic Search 2022.02.09