Studies in Phonetics, Phonology and Morphology, Vol. 29, No. 3, Sunwoo Park

2024.01.31 09:05

DOI: http://dx.doi.org/10.17959/sppm.2023.29.3.329

Machine learning classification of native Korean, Sino-Korean, and loanwords in Korean based on phonotactic information

Sunwoo Park (Keimyung University)

Abstract

This study tests models that automatically classify Korean nouns as native Korean, Sino-Korean, or loanwords using a machine learning method, naïve Bayes classification. We collected 500 native Korean words, Sino-Korean words, and loanwords, romanized them, decomposed them into bigram and trigram lists, and entered the bigrams and trigrams into the naïve Bayes classifier. We tested models with and without syllable boundaries and found that both the bigram and trigram models exceeded 80% accuracy. Contrary to the expectation that performance would improve as more information about Korean phonotactics was included in the training and validation data, the difference in performance between the bigram and trigram models was not significant. The models that included syllable boundaries in the phoneme sequence information were slightly more accurate than those without syllable boundary information. Of the five models compared, the bigram model with syllable boundaries performed best, with an accuracy of 83.55%. The current models consider only phoneme sequence information and syllable boundaries; accuracy is expected to improve if the models are trained without the bigrams and trigrams that occur in similar proportions in all categories and if the dataset is enlarged.
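
The pipeline summarized in the abstract (romanize, split into n-grams with or without syllable boundaries, classify with naïve Bayes) can be illustrated with a minimal sketch. This is not the author's implementation. Everything named here is an assumption made for illustration: the function `ngrams`, the class `NaiveBayes`, the `#` word-edge and `.` syllable-boundary symbols, the add-one (Laplace) smoothing, and the nine toy words. Each romanized character is treated as one segment, which is a simplification.

```python
import math
from collections import Counter, defaultdict


def ngrams(syllables, n, keep_boundaries=True):
    """Split a romanized word (a list of syllables) into character n-grams.

    "#" pads the word edges and "." marks syllable boundaries; both symbols
    are illustrative choices, not taken from the paper.
    """
    sep = "." if keep_boundaries else ""
    seq = "#" + sep.join(syllables) + "#"
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes over n-gram counts with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for feats, label in zip(docs, labels):
            self.feature_counts[label].update(feats)
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total)  # log prior
            for f in feats:
                if f in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((counts[f] + self.alpha) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Hypothetical toy data, far smaller than the study's word lists.
toy = [
    (["na", "mu"], "native"), (["ba", "da"], "native"), (["ha", "na"], "native"),
    (["hak", "gyo"], "sino"), (["guk", "ga"], "sino"), (["hak", "saeng"], "sino"),
    (["keo", "pi"], "loan"), (["beo", "seu"], "loan"), (["ka", "me", "ra"], "loan"),
]
model = NaiveBayes().fit([ngrams(w, 2) for w, _ in toy], [c for _, c in toy])

print(ngrams(["na", "mu"], 2))                    # bigrams with syllable boundary
print(ngrams(["na", "mu"], 2, keep_boundaries=False))
print(model.predict(ngrams(["guk", "min"], 2)))   # unseen toy word
```

Switching `n` to 3 or setting `keep_boundaries=False` gives the other model variants the abstract compares. With so few toy words the predictions only demonstrate the mechanics and say nothing about the reported accuracies.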

Keywords

phonotactics, native Korean, Sino-Korean, loanword, machine learning, Naïve Bayes classification, bigram model, trigram model
