Scope and Challenges in Conversational AI using Transformer Models

Authors

  • Arighna Chakraborty, Department of Computer Science, St. Xavier's College (Autonomous), Kolkata, India
  • Asoke Nath, Department of Computer Science, St. Xavier's College (Autonomous), Kolkata, India

DOI:

https://doi.org/10.32628/CSEIT217696

Keywords:

deep learning, neural networks, recurrent neural networks, long short-term memory, sequence-to-sequence, transformer models, switch transformer models

Abstract

Conversational AI is a challenging problem at the intersection of natural language processing (NLP) and machine learning. The field has advanced rapidly, with each new model architecture able to process more data, handle more parameters, optimise and execute more effectively, and achieve higher accuracy and efficiency. This paper discusses the major trends and advancements in NLP and conversational AI: recurrent neural networks (RNNs) and RNN-based architectures such as LSTMs, sequence-to-sequence models, and finally the Transformer networks that represent the current state of the art. The authors compare the models discussed in terms of efficiency and accuracy, and examine the scope and challenges of Transformer models.
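Although the abstract itself contains no code, the attention mechanism at the core of the Transformer architecture it surveys can be summarised in a few lines. The following is a minimal, illustrative NumPy sketch of scaled dot-product attention, softmax(QKᵀ / √d_k)·V; the function name, shapes, and toy inputs are assumptions for demonstration, not material from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Illustrative sketch: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    # Query-key similarity scores, scaled so the softmax does not
    # saturate when d_k is large.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is an attention-weighted average of the values.
    return weights @ V

# Hypothetical toy example: 3 queries attending over 4 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # (queries, d_k)
K = rng.normal(size=(4, 8))   # (keys, d_k)
V = rng.normal(size=(4, 16))  # (keys, d_v)
print(scaled_dot_product_attention(Q, K, V).shape)  # -> (3, 16)
```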

Published

2021-12-30

Section

Research Articles

How to Cite

[1] Arighna Chakraborty and Asoke Nath, "Scope and Challenges in Conversational AI using Transformer Models," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 7, Issue 6, pp. 372-384, November-December 2021. DOI: https://doi.org/10.32628/CSEIT217696