Knowledge Graph Analysis On English Wikipedia Pages Using A Deep Learning Algorithm

Authors

  • Yudistira Bagus Pratama, Universitas Muhammadiyah Bangka Belitung
  • Haiyudi Haiyudi, Universitas Muhammadiyah Bangka Belitung

DOI:

https://doi.org/10.25299/itjrd.2023.13459

Keywords:

Deep Learning, Data Mining, Community Detection, Wikipedia, Knowledge Graph

Abstract

Analyzing social networks or online communities is difficult at scale because many network measurements are computationally demanding and require expensive hardware. Identifying the community structure of a network, for example, is a very computationally expensive task. Graph embedding represents a graph as vectors so that further analysis becomes easier. The purpose of this research is to analyze the knowledge graph built from Wikipedia article data. It implements web scraping on the Wikipedia article search engine, displays similar Wikipedia pages, and analyzes them using a predetermined deep learning algorithm. Data were collected by scraping the unstructured Wikipedia website and processing the results into structured data. The method follows the Cross-Industry Standard Process for Data Mining (CRISP-DM), with phases of data collection, data processing, proposed algorithms, testing, and evaluation. The algorithms applied are DeepWalk, k-means, and Girvan-Newman. This research is expected to provide knowledge about a deep learning approach to representing the Wikipedia pages knowledge graph, to help users find similar Wikipedia pages, and to enrich literacy on knowledge graph analysis.
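As a rough illustration of the embedding step named in the abstract, DeepWalk starts by sampling short random walks over the graph and treating each walk like a sentence. The sketch below shows only that walk-sampling stage on a toy graph. The page titles, the `LINKS` dictionary, the function names, and the parameter values are all hypothetical and are not taken from the article's data or code.

```python
import random

# Hypothetical toy link graph (not the article's data): each key is a page
# title and each value lists the pages it is linked with. Links are stored
# in both directions so the graph is undirected.
LINKS = {
    "Graph theory": ["Random walk", "Community structure"],
    "Random walk": ["Graph theory", "Word2vec"],
    "Community structure": ["Graph theory", "K-means clustering"],
    "Word2vec": ["Random walk", "Deep learning"],
    "K-means clustering": ["Community structure", "Deep learning"],
    "Deep learning": ["Word2vec", "K-means clustering"],
}


def random_walk(graph, start, walk_length, rng):
    """Return one truncated random walk of at most walk_length nodes."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def build_corpus(graph, walks_per_node=10, walk_length=5, seed=0):
    """Sample walks_per_node walks from every node, shuffling node order each pass."""
    rng = random.Random(seed)
    nodes = list(graph)
    corpus = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            corpus.append(random_walk(graph, node, walk_length, rng))
    return corpus


corpus = build_corpus(LINKS)
print(len(corpus), "walks sampled; first walk:", corpus[0])
```

In a fuller pipeline, the walks would typically be passed to a skip-gram model to learn one vector per page, and those vectors could then be clustered with k-means. Both steps are omitted here, and the article's own implementation may differ.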

Published

2024-05-13

How to Cite

Pratama, Y. B., & Haiyudi, H. (2024). Knowledge Graph Analysis On English Wikipedia Pages Using A Deep Learning Algorithm. IT Journal Research and Development, 8(2), 175–186. https://doi.org/10.25299/itjrd.2023.13459

Section

Articles