Thesis access permission: user-defined availability period
Available:
Campus: available
Off-campus: available
Title: 基於問答生成之資料增強 (Question-Answer Generation for Data Augmentation)
Department:
Year, semester:
Language:
Degree:
Number of pages: 46
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2019-11-06
Date of Submission: 2019-11-13
Keywords: data augmentation, neural network, generative adversarial network, question generation, reading comprehension
Statistics: The thesis has been browsed 5971 times and downloaded 14 times.
Abstract (in Chinese)
This thesis addresses three problems in natural language processing: natural language understanding, key content extraction, and question generation, with the main goal of building a better question generation model. Most existing question generation studies assume that the answer is known or readily available, while others treat answer extraction and question generation as a two-stage task. Earlier neural question generation models were mostly built on recurrent neural networks; we instead adopt an attention-based question-answer generation model that performs answer extraction and question generation jointly, and we use a question-answering model to evaluate whether the generated question-answer pairs are effective for data augmentation. The experimental results show that a model pre-trained on a large dataset benefits little from data augmentation when transferred, whereas in the other experiments, question-answering models trained with samples produced by the question-answer generation model as augmented data achieve better accuracy. We also experiment with a generative adversarial network to produce additional augmented data; the results show that the GAN-generated samples do not significantly improve the effectiveness of data augmentation.
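The augmentation loop described above is simple to state concretely. The following is a minimal sketch, assuming a hypothetical QAG model object whose generate() method returns (question, answer) pairs for a passage; the method name, the record fields, and the fine_tune step are illustrative placeholders, not the thesis's actual code.

```python
# Minimal sketch of QA data augmentation with a question-answer
# generation (QAG) model. All names here are hypothetical placeholders.

def augment_training_set(passages, qag_model, original_data, pairs_per_passage=3):
    """Return the original QA training set extended with generated pairs."""
    augmented = list(original_data)
    for passage in passages:
        # The QAG model jointly extracts an answer span from the passage
        # and generates a question for it; asking for several candidates
        # (e.g., via diverse beam search) yields more varied pairs.
        for question, answer in qag_model.generate(passage, k=pairs_per_passage):
            augmented.append({"context": passage,
                              "question": question,
                              "answer": answer})
    return augmented

# The augmented set then trains the downstream QA model, e.g.:
#   qa_model = fine_tune(qa_model, augment_training_set(passages, qag_model, train_set))
```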
Abstract
Question generation has become a popular topic with the rise of deep neural networks; however, previous works either assume that the answers for question generation are known or treat answer extraction and question generation as separate tasks. Here we propose a question-answer generation model based on the attention mechanism. The results show that a fine-tuned question-answer generation model achieves better performance and can serve as a good data augmentation method for question answering. We also find that a generative adversarial network does not significantly improve the performance of the question-answer generation model for data augmentation. In addition, we test the performance of data augmentation in various circumstances and find that a model pre-trained on a large corpus does not benefit from data augmentation.
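The downstream question-answering models are scored with Exact Match and F1 (Section 5.2.3 in the table of contents below). Here is a minimal sketch of the standard SQuAD-style computation of these two metrics; the exact answer normalization the thesis applies is an assumption.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1(prediction, gold):
    """Token-overlap F1 between the predicted and gold answer."""
    pred, ref = normalize(prediction).split(), normalize(gold).split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))  # 1.0
print(round(f1("in the city of Paris", "Paris"), 2))    # 0.4
```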
Table of Contents
摘要 (Abstract in Chinese) ii
Abstract iii
Chapter 1 Introduction 1
  1.1 Motivation and Problem Description 1
  1.2 Main Contribution 2
  1.3 Thesis Structure 3
Chapter 2 Related Work 5
  2.1 Question Generation 5
  2.2 Generative Adversarial Network in Text Generative Model 6
  2.3 Data Augmentation in NLP 7
Chapter 3 Background 9
  3.1 Related Model 9
    3.1.1 Transformer 9
    3.1.2 Bidirectional Encoder Representations from Transformer 12
  3.2 Algorithm 13
    3.2.1 Cross-Entropy 13
    3.2.2 Beam Search 14
Chapter 4 Method 15
  4.1 Question-Answer Generation Model 15
  4.2 Diversity-Promoting Generative Adversarial Network 16
  4.3 Diverse Beam Search 19
Chapter 5 Evaluation 21
  5.1 Experimental Setup 21
    5.1.1 Question-Answer Generation Model 21
    5.1.2 Adversarial Training 22
    5.1.3 Question-Answering Model 22
  5.2 Evaluation Metrics 23
    5.2.1 Bilingual Evaluation Understudy 24
    5.2.2 Recall-Oriented Understudy for Gisting Evaluation 24
    5.2.3 Exact Match and F1 score 24
  5.3 Results and Analysis 25
    5.3.1 Question-Answer Generation Model 25
    5.3.2 Question-Answering Model 25
  5.4 Summary 30
Chapter 6 Conclusion and Future Work 33
  6.1 Conclusion 33
  6.2 Future Work 34
Bibliography 35
Fulltext
This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Printed copies
Public access information for printed copies is relatively complete from academic year 102 (2013-14) onward. To look up access information for printed copies from academic year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience. Availability: available.