MULTILINGUAL PARALLEL CORPUS ARCHITECTURE AND AN ALGORITHM FOR AUTOMATIC SENTENCE ALIGNMENT
DOI:
https://doi.org/10.47390/ydif-y2026v2i8/n02
Keywords:
parallel corpus, multilingualism, architecture, sentence alignment, NLP, machine translation.
Abstract
This article proposes a flexible architecture for creating, storing, and efficiently using multilingual parallel corpora, together with an algorithm for automatic sentence alignment. The proposed architecture is built on a three-level hierarchical model consisting of the work, sentence, and word levels. To identify parallel sentences, a hybrid algorithm was developed that combines three criteria: length similarity, order proximity, and semantic similarity.
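To make the abstract's two ideas concrete, the following is a minimal Python sketch of a work/sentence/word hierarchy and a hybrid score that mixes length similarity, order proximity, and semantic similarity. It is an illustration under stated assumptions, not the article's implementation: the class and function names, the weights (0.3, 0.2, 0.5), the 0.5 threshold, the greedy best-match selection, and the `toy_semantic` lexicon lookup (a stand-in for a sentence-embedding similarity) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Sentence:
    index: int                                        # position inside its work
    text: str
    tokens: List[str] = field(default_factory=list)   # word level

    def __post_init__(self) -> None:
        # Assumption: whitespace tokenization is enough for this sketch.
        if not self.tokens:
            self.tokens = self.text.split()


@dataclass
class Work:
    work_id: str
    language: str
    sentences: List[Sentence] = field(default_factory=list)  # sentence level

    @classmethod
    def from_texts(cls, work_id: str, language: str, texts: List[str]) -> "Work":
        return cls(work_id, language, [Sentence(i, t) for i, t in enumerate(texts)])


def length_similarity(a: Sentence, b: Sentence) -> float:
    """Ratio of the shorter to the longer character length, in [0, 1]."""
    la, lb = len(a.text), len(b.text)
    return 1.0 if max(la, lb) == 0 else min(la, lb) / max(la, lb)


def order_proximity(a: Sentence, b: Sentence, n_src: int, n_tgt: int) -> float:
    """1 minus the distance between the sentences' relative positions."""
    pa = a.index / max(n_src - 1, 1)
    pb = b.index / max(n_tgt - 1, 1)
    return 1.0 - abs(pa - pb)


def toy_semantic(src_text: str, tgt_text: str) -> float:
    """Hypothetical stand-in for an embedding similarity: the share of source
    words whose entry in a tiny hand-made lexicon occurs in the target."""
    lexicon = {"kitob": "book", "maktabga": "school"}
    src_words = src_text.lower().strip(".").split()
    tgt_words = set(tgt_text.lower().strip(".").split())
    hits = sum(1 for w in src_words if lexicon.get(w) in tgt_words)
    return hits / len(src_words) if src_words else 0.0


def hybrid_score(a: Sentence, b: Sentence, n_src: int, n_tgt: int,
                 semantic: Callable[[str, str], float],
                 weights: Tuple[float, float, float] = (0.3, 0.2, 0.5)) -> float:
    """Weighted sum of the three criteria; the weights are assumed values."""
    w_len, w_ord, w_sem = weights
    return (w_len * length_similarity(a, b)
            + w_ord * order_proximity(a, b, n_src, n_tgt)
            + w_sem * semantic(a.text, b.text))


def align(src: Work, tgt: Work, semantic: Callable[[str, str], float],
          threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Greedy 1-to-1 pairing: each source sentence takes its best-scoring
    target sentence if the score reaches the (assumed) threshold."""
    n_src, n_tgt = len(src.sentences), len(tgt.sentences)
    pairs = []
    for s in src.sentences:
        scored = [(hybrid_score(s, t, n_src, n_tgt, semantic), t.index)
                  for t in tgt.sentences]
        if not scored:
            continue
        best_score, best_index = max(scored)
        if best_score >= threshold:
            pairs.append((s.index, best_index))
    return pairs


if __name__ == "__main__":
    uz = Work.from_texts("w1", "uz", ["Men kitob oldim.", "U maktabga bordi."])
    en = Work.from_texts("w1", "en", ["I bought a book.", "He went to school."])
    print(align(uz, en, toy_semantic))
```

In practice the `semantic` argument would be replaced by a real multilingual sentence-similarity function, and the greedy selection could give way to a dynamic-programming search that also handles one-to-many alignments.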