Reliability of Human Translations’ Scores Using Automated Translation Quality Evaluation Understudy Metrics

Karami, Somayyeh; Nejadansari, Dariush; Hesabi, Akbar

doi:10.22059/jflr.2020.309025.751


Journal of Foreign Language Research
Volume 10, Issue 3, Aban 1399 (October-November 2020), pages 618-629
Article type: Research paper (regular)
Digital Object Identifier (DOI): 10.22059/jflr.2020.309025.751
Authors
Somayyeh Karami¹; Dariush Nejadansari²*; Akbar Hesabi³
¹PhD candidate in Translation, University of Isfahan, Isfahan, Iran
²Assistant Professor, Department of English Language and Literature (English Language Teaching), University of Isfahan, Isfahan, Iran
³Assistant Professor, Department of English Language and Literature (General Linguistics), University of Isfahan, Isfahan, Iran
Abstract
Because translation assessment is demanding in terms of time, energy, and cost, drawing on recent technologies from the field of machine translation seems reasonable. Automated translation quality evaluation understudy metrics are one such technology used in machine translation. This study asks how the reliability of the scores that these metrics assign, at the lexical level, to human translations (by 51 senior translation students in Iran) changes when 1, 2, ..., up to 5 reference translations are used step by step and separately. To this end, an empirical, applied study with a quantitative approach was conducted to measure the reliability of the metrics' scores against the mean scores of 5 expert raters. The strength of the correlation between these two sets of scores (under each condition of using 1, 2, ..., up to 5 reference translations) was interpreted as the reliability of the automated metric's scores. Pearson correlation analysis showed that using 5 reference translations produced the highest correlation in 37.80% of cases, more than any other condition in this study (4 reference translations: 3.65%; 3 reference translations: 10.97%; 2 reference translations: 31.70%; 1 reference translation: 15.85%). The research hypothesis, that using more reference translations leads to higher correlation and more reliable scores, is therefore supported. At the same time, using 2 reference translations ranks second in achieving the highest correlation, which runs counter to the hypothesis.
Keywords
Translation quality assessment; automated translation quality evaluation understudy metrics; automated scoring; reliability; reference translation
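
The abstract does not say which lexical metrics were used or how they combine several reference translations. As a rough illustration only, the sketch below shows one common way a word-level score can draw on more than one reference: a clipped unigram precision in the style of BLEU's modified precision. The function name `clipped_unigram_precision` and the example sentences are hypothetical and are not taken from the study.

```python
from collections import Counter


def clipped_unigram_precision(candidate, references):
    """Fraction of candidate tokens matched in the references.

    Each token's count is clipped to the largest count it has in any
    single reference, so repeating a word cannot inflate the score.
    This is an illustrative metric, not the one used in the study.
    """
    cand_counts = Counter(candidate.lower().split())
    if not cand_counts:
        return 0.0

    # For every token, remember its highest count across the references.
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref.lower().split()).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)

    matched = sum(
        min(count, max_ref_counts[token])
        for token, count in cand_counts.items()
    )
    return matched / sum(cand_counts.values())


# Hypothetical student translation and reference translations.
candidate = "the cat sat on the rug"
references = [
    "the cat sat on the mat",
    "a cat was sitting on the rug",
]

one_ref = clipped_unigram_precision(candidate, references[:1])
two_refs = clipped_unigram_precision(candidate, references)
```

In this sketch, adding a reference can only keep the score the same or raise it, because a candidate word has more chances to be matched. Whether the resulting scores also agree better with human raters is the separate question the study investigates.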
Article title [English]
Reliability of Human Translations’ Scores Using Automated Translation Quality Evaluation Understudy Metrics
Authors [English]
Somayyeh Karami¹; Dariush Nejadansari²; Akbar Hesabi³
¹PhD Candidate in Translation, University of Isfahan, Isfahan, Iran

³Faculty of Foreign Languages, University of Isfahan
Abstract [English]
Considering how costly translation quality assessment is in terms of time, money, and energy, it seems logical to benefit from modern technologies introduced in the field of machine translation (MT). Automated Translation Quality Evaluation Understudy Metrics (ATQEUMs) are among these technologies and have shown promise in assessing the quality of MT output. This study, however, examines the reliability of the scores that lexical ATQEUMs assign to human-translated texts (i.e. those produced by 51 senior students of translator training programs in Iran) using 1, 2, …, 5 reference translations successively and separately. To this end, an empirical applied study was conducted following a quantitative approach to assess the reliability of the lexical ATQEUMs’ scores in comparison with the expert scorers’ scores. The higher the correlation between the two sets of scores (at each stage of using 1, 2, …, 5 reference translations), the higher the reliability is interpreted to be. The Pearson correlation analysis revealed that using 5 reference translations led to the highest correlations in 37.80% of cases, more than in any other condition considered (i.e. using 4 reference translations (3.65%), 3 reference translations (10.97%), 2 reference translations (31.70%), and 1 reference translation (15.85%)). However, using 2 reference translations ranked second in yielding the highest correlations, which contradicts the hypothesis that more reference translations would lead to higher correlations and reliability.
Keywords [English]
Translation quality assessment, Lexical automated translation quality evaluation understudy metrics, Automated scoring, Reliability, Reference translations
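
The reliability check described in the abstract can be sketched roughly as follows: average the expert scorers' scores for each translation, compute the Pearson correlation between that average and the metric's scores for each number of reference translations, and see which number correlates best. All numbers below are invented toy values, the names `pearson_r`, `rater_scores`, and `metric_scores_by_ref_count` are hypothetical, and only three reference counts are shown where the study used up to five.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data (invented): each row is one expert scorer, each column one translation.
rater_scores = [
    [14, 17, 11, 19],
    [15, 16, 12, 18],
    [13, 18, 10, 20],
]
# Mean expert score per translation.
human_mean = [mean(column) for column in zip(*rater_scores)]

# Toy metric scores (invented) for the same translations,
# keyed by how many reference translations the metric was given.
metric_scores_by_ref_count = {
    1: [0.30, 0.31, 0.29, 0.30],
    2: [0.35, 0.40, 0.33, 0.39],
    3: [0.28, 0.34, 0.22, 0.38],
}

correlations = {
    ref_count: pearson_r(scores, human_mean)
    for ref_count, scores in metric_scores_by_ref_count.items()
}
# Reference count whose metric scores track the expert mean most closely.
best_ref_count = max(correlations, key=correlations.get)
```

Repeating this comparison for every metric and text, then counting how often each reference count comes out on top, would give percentages of the kind reported in the abstract. That tallying step is an assumption about the procedure rather than something the abstract spells out.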