Sentence Segmentation from Unformatted Text using Language Modeling and Sequence Labeling Approaches

Iosifov, Ievgen and Iosifova, Olena and Sokolov, Volodymyr (2020) Sentence Segmentation from Unformatted Text using Language Modeling and Sequence Labeling Approaches. 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&T), 1 (1). pp. 335-337. ISSN 978-172819177-5

Abstract

This research addresses the natural language processing problem of sentence segmentation from raw text. The focus is on segmenting auto-generated video transcripts, which contain neither punctuation nor sentence boundaries. Two general approaches to sentence segmentation are proposed, language modeling and sequence labeling, and experiments compare the results of pre-trained transformer-based models under each. The study examines how the choice of approach affects the results. The sequence labeling approach proved to be the most suitable.
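
As an illustration of how sentence segmentation can be framed as sequence labeling, the sketch below assigns one label per token and cuts the token stream wherever a sentence-end label occurs. It is a minimal sketch and not the authors' implementation: the label names "E" and "O", the function names, and the sample transcript are all assumptions made for this example, and in practice the labels would come from a fine-tuned token classifier rather than from gold sentences.

```python
from typing import List


def labels_from_sentences(sentences: List[List[str]]) -> List[str]:
    """Build per-token training labels from already segmented sentences.

    Hypothetical label scheme: "E" marks the last token of a sentence,
    "O" marks every other token.
    """
    labels: List[str] = []
    for sentence in sentences:
        if not sentence:
            continue  # skip empty sentences, they carry no tokens
        labels.extend(["O"] * (len(sentence) - 1) + ["E"])
    return labels


def segment(tokens: List[str], labels: List[str]) -> List[List[str]]:
    """Split an unpunctuated token stream at every token labeled "E"."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    sentences: List[List[str]] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        current.append(token)
        if label == "E":
            sentences.append(current)
            current = []
    if current:
        # Trailing tokens without a predicted boundary form a final sentence.
        sentences.append(current)
    return sentences


# Made-up transcript fragment, already split into sentences.
gold = [
    ["hello", "everyone"],
    ["today", "we", "talk", "about", "transformers"],
    ["thanks", "for", "watching"],
]
tokens = [token for sentence in gold for token in sentence]
labels = labels_from_sentences(gold)  # stand-in for model predictions
recovered = segment(tokens, labels)
```

Here `recovered` equals `gold`, since the labels were derived from the gold sentences themselves; with predicted labels, the quality of the segmentation depends entirely on the classifier.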

Item Type: Article
Uncontrolled Keywords: fine-tuning; natural language processing; NLP; sentence segmentation component; transformer
Subjects: Articles in scientometric databases > Scopus
Divisions: Faculties > Faculty of Information Technology and Management > Department of Information and Cyber Security
Depositing User: Павло Миколайович Складанний
Date Deposited: 13 Sep 2021 07:50
Last Modified: 13 Sep 2021 07:50
URI: https://elibrary.kubg.edu.ua/id/eprint/37097
