Improved Text Classification using Long Short-Term Memory and Word Embedding Technique

AUTHORS

Amol C. Adamuthe, Assistant Professor, Rajarambapu Institute of Technology, Rajaramnagar, MS, India

ABSTRACT

Text classification is an important problem in spam filtering, sentiment analysis, news filtering, document organization, document retrieval, and many other applications. The complexity of text classification increases with the number of classes and training samples. The main objective of this paper is to improve the accuracy of text classification using long short-term memory (LSTM) with word embedding. Experiments were conducted on seven benchmark datasets, namely IMDB, Amazon review full score, Amazon review polarity, Yelp review polarity, AG news topic classification, Yahoo! Answers topic classification, and DBpedia ontology classification, which differ in the number of classes and training samples. Separate experiments were conducted to evaluate the effect of each parameter on the LSTM. Results show that a batch size of 100, 50 epochs, the Adagrad optimizer, 5 hidden nodes, a word vector length of 100, 2 LSTM layers, L2 regularization of 0.001, and a learning rate of 0.001 give the highest accuracy. The results of the LSTM are compared with those reported in the literature. For the IMDB, Amazon review full score, and Yahoo! Answers topic classification datasets, the obtained results are better than those in the literature. The results for Amazon review polarity, Yelp review polarity, and AG news topic classification are close to the best-known results. For the DBpedia ontology classification dataset, the accuracy is above 91% but below the best-known result.
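The best-performing configuration reported above can be illustrated with a short sketch. The following is a minimal, hypothetical Keras example, not the paper's actual implementation: the framework choice, vocabulary size, and number of classes are assumptions for illustration. It builds a two-layer LSTM classifier over 100-dimensional word embeddings with the reported hyperparameters (batch size 100, 50 epochs, Adagrad optimizer, 5 hidden nodes per LSTM layer, L2 regularization 0.001, learning rate 0.001).

```python
# Minimal sketch (assumed Keras implementation) of the LSTM text classifier
# configuration reported in the abstract.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

VOCAB_SIZE = 20000   # assumed vocabulary size (illustrative)
NUM_CLASSES = 2      # e.g., 2 for a polarity dataset; set per dataset

model = tf.keras.Sequential([
    # 100-dimensional word embeddings (word vector length of 100)
    layers.Embedding(VOCAB_SIZE, 100),
    # two stacked LSTM layers with 5 hidden nodes and L2 regularization of 0.001
    layers.LSTM(5, return_sequences=True, kernel_regularizer=regularizers.l2(0.001)),
    layers.LSTM(5, kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Training with the reported batch size and epoch count; x_train is a padded
# array of word indices and y_train holds integer class labels (assumed inputs).
# model.fit(x_train, y_train, batch_size=100, epochs=50, validation_split=0.1)
```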

 

KEYWORDS

Text classification, Long short-term memory, Deep learning, Word2Vec, Word embedding


CITATION

  • APA:
    Adamuthe, A. C. (2020). Improved Text Classification using Long Short-Term Memory and Word Embedding Technique. International Journal of Hybrid Information Technology, 13(1), 19-32. doi:10.21742/IJHIT.2020.13.1.03
  • Harvard:
    Adamuthe, A. C. (2020). "Improved Text Classification using Long Short-Term Memory and Word Embedding Technique". International Journal of Hybrid Information Technology, 13(1), pp.19-32. doi:10.21742/IJHIT.2020.13.1.03
  • IEEE:
    [1] A. C. Adamuthe, "Improved Text Classification using Long Short-Term Memory and Word Embedding Technique," International Journal of Hybrid Information Technology, vol.13, no.1, pp.19-32, Mar. 2020
  • MLA:
    Adamuthe, Amol C. "Improved Text Classification using Long Short-Term Memory and Word Embedding Technique". International Journal of Hybrid Information Technology, vol.13, no.1, Mar. 2020, pp.19-32, doi:10.21742/IJHIT.2020.13.1.03
 

COPYRIGHT

© 2020 Amol C. Adamuthe. Published by Global Vision Press. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

ISSUE INFO

  • Volume 13, No. 1, 2020
  • ISSN (print): 1738-9968
  • ISSN (electronic): 2652-2233
  • Published: Mar. 2020
