MULTILINGUAL PARALLEL CORPUS ARCHITECTURE AND AN ALGORITHM FOR AUTOMATIC SENTENCE ALIGNMENT
DOI:
https://doi.org/10.47390/ydif-y2026v2i8/n02
Keywords:
parallel corpus, multilingualism, architecture, sentence alignment, NLP, machine translation.
Abstract
This article proposes a flexible architecture for building, storing, and efficiently using multilingual parallel corpora, together with an algorithm for automatic sentence alignment. The proposed architecture is built on a three-level hierarchical model consisting of the work, sentence, and word levels. To identify parallel sentences, a hybrid algorithm was developed that combines three criteria: length similarity, order proximity, and semantic similarity.
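The abstract gives no formulas, so the following is only a minimal sketch of how a work/sentence/word hierarchy and a three-criterion hybrid score might fit together. Every name here (`Work`, `Sentence`, `hybrid_score`, `align`), the weights, the threshold, and the greedy matching strategy are illustrative assumptions, not the authors' implementation. The `token_overlap` function is a monolingual stand-in for the semantic criterion; a real system would presumably pass a cross-lingual similarity function (for example, cosine similarity of multilingual sentence embeddings) through the `semantic` parameter.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Sentence:
    index: int                      # position of the sentence inside its work
    text: str
    words: List[str] = field(default_factory=list)  # word level

    def __post_init__(self) -> None:
        # Naive whitespace tokenization (assumption; the source does not
        # say how the word level is populated).
        if not self.words:
            self.words = self.text.lower().split()


@dataclass
class Work:
    title: str
    lang: str
    sentences: List[Sentence] = field(default_factory=list)  # sentence level

    @classmethod
    def from_sentences(cls, title: str, lang: str, texts: List[str]) -> "Work":
        return cls(title, lang, [Sentence(i, t) for i, t in enumerate(texts)])


def length_similarity(a: Sentence, b: Sentence) -> float:
    """Ratio of the shorter to the longer character length, in [0, 1]."""
    la, lb = len(a.text), len(b.text)
    return min(la, lb) / max(la, lb) if max(la, lb) else 1.0


def order_proximity(a: Sentence, b: Sentence, n_src: int, n_tgt: int) -> float:
    """1 minus the distance between relative positions within each work."""
    pa = a.index / max(n_src - 1, 1)
    pb = b.index / max(n_tgt - 1, 1)
    return 1.0 - abs(pa - pb)


def token_overlap(a: Sentence, b: Sentence) -> float:
    """Jaccard overlap of word sets: a placeholder for semantic similarity."""
    sa, sb = set(a.words), set(b.words)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0


Semantic = Callable[[Sentence, Sentence], float]


def hybrid_score(a: Sentence, b: Sentence, n_src: int, n_tgt: int,
                 semantic: Semantic = token_overlap,
                 weights: Tuple[float, float, float] = (0.3, 0.2, 0.5)) -> float:
    """Weighted sum of the three criteria; the weights are assumed values."""
    w_len, w_ord, w_sem = weights
    return (w_len * length_similarity(a, b)
            + w_ord * order_proximity(a, b, n_src, n_tgt)
            + w_sem * semantic(a, b))


def align(src: Work, tgt: Work, threshold: float = 0.3,
          semantic: Semantic = token_overlap) -> List[Tuple[int, int, float]]:
    """Greedy 1:1 candidate selection: best target per source sentence."""
    pairs: List[Tuple[int, int, float]] = []
    n, m = len(src.sentences), len(tgt.sentences)
    if m == 0:
        return pairs
    for s in src.sentences:
        best = max(tgt.sentences,
                   key=lambda t: hybrid_score(s, t, n, m, semantic))
        score = hybrid_score(s, best, n, m, semantic)
        if score >= threshold:
            pairs.append((s.index, best.index, score))
    return pairs
```

Calling `align(src, tgt)` on two `Work` objects built with `Work.from_sentences(...)` returns `(source_index, target_index, score)` triples. The greedy selection can map several source sentences to the same target and does not handle 1:2 or 2:1 merges; a dynamic-programming search over the same score would be one way to address that, though the abstract does not say which approach the authors use.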
License

This work is licensed under a Creative Commons Attribution 4.0 International license.