Twitter Sentiment Analysis and Its Effect on Society on the Fall of Cryptocurrency

ABSTRACT


INTRODUCTION
Cryptocurrency, or digital currency, has become a global phenomenon in many countries, such as the United States, Japan, and China. In Indonesia, cryptocurrencies remain contested in terms of regulation and the legality of their use. The Indonesian government, through the Commodity Futures Trading Regulatory Agency (CoFTRA), continues to review and issue regulations on cryptocurrency assets.
According to Gormantara (2020), sentiment analysis is a field that builds systems for recognizing and extracting opinions from text using Natural Language Processing (NLP). With sentiment analysis, sentences or words expressed on social media can be grouped into positive, negative, or neutral responses. It can also be used to assess conditions in the community and later serve as a guide for making decisions or policy [2]. Sentiment analysis is well developed and has been widely discussed in research journals, including the study by Nurfidah Dwiyanti (2021) titled "New Normal Twitter Sentiment Analysis". That study classified people's sentiments toward new habits, or the New Normal, and found that positive sentiment outweighed negative and neutral sentiment: 57% positive, 8% negative, and 35% neutral [3].
The current research crawls data using the Tweepy library in Python, classifies it with the Textblob library in Python, and uses English search keywords on Twitter. Based on the description above, the purpose of this research is to apply sentiment analysis to classify public views on the fall of crypto, as expressed on Twitter, into positive, negative, and neutral sentiments.
The expected benefit of this research is an understanding of society's response to the fall of crypto. The sentiment analysis also offers the public a perspective that encourages caution when investing funds in crypto.

RESEARCH METHOD
2.1. Method of collecting data
The data collection method used in this research gathers the sentiment of Twitter users who posted with the hashtag #cryptocrash, using Python. The research consists of three stages: data crawling, preprocessing, and classification and visualization of results. The flow of the research is shown in Figure 1. The first stage collects public opinion on Twitter. The second stage cleans the data before further analysis: a filter removes duplicate retweets, and re.sub deletes URLs and other unwanted characters by matching a regular expression built from a predetermined character set. The third stage classifies sentiment using the Textblob library in Python, which reports the sentiment of a text as a pair of values (polarity, subjectivity).

Python
Python is one of the programming languages most widely used by programmers to write their programs. Its syntax is not overly complicated, which has made it one of the easiest high-level programming languages to use. When writing a program in Python, some rules must be followed to avoid errors or problems in the resulting program. The first Python syntax rule concerns how statements or commands are written.

RESULTS AND ANALYSIS
The data in this study were collected by crawling tweets from the Twitter API with the Tweepy library, which makes it easy to retrieve tweets matching given keywords. The keyword used is the hashtag #cryptocrash, in English, covering the 30 days up to 28 September 2022. A sample of the crawling results is shown in Figure 2. API.search_30_day retrieves tweets from the previous 30 days that match the query #cryptocrash. With the standard API format, a single request returns at most 100 tweets. Pagination is therefore needed to retrieve 3000 tweets, and it is provided by tweepy.Cursor(method, *args, **kwargs).
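The paginated crawl described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names crawl_tweets and pages_needed, the label argument (the name of the developer environment that search_30_day requires), and the choice to keep only each tweet's text are assumptions. An authenticated tweepy.API object is expected to be supplied by the caller.

```python
import math


def pages_needed(total_tweets: int, page_size: int = 100) -> int:
    """Number of requests needed when each request returns at most page_size tweets."""
    return math.ceil(total_tweets / page_size)


def crawl_tweets(api, label: str, query: str = "#cryptocrash", total: int = 3000):
    """Collect up to `total` tweet texts from the 30-day search endpoint.

    `api` is an authenticated tweepy.API instance. `label` is the developer
    environment name, which is an assumption because the text does not give it.
    """
    import tweepy  # imported lazily so pages_needed works without Tweepy installed

    cursor = tweepy.Cursor(
        api.search_30_day,
        label=label,
        query=query,
        maxResults=100,  # page size quoted in the text
    )
    # Cursor follows the pagination token until `total` items have been yielded.
    return [status.text for status in cursor.items(total)]


print(pages_needed(3000))  # requests needed for 3000 tweets at 100 per request
```

With 100 tweets per request, collecting 3000 tweets takes 30 requests, which is why a single call to the search method is not enough.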
After data collection, the next step is preprocessing the Twitter data. Preprocessing cleans the data by deleting URLs and hashtags using regular expressions. This is done in a function that applies re.sub("([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", ...) to the tweet status text. The function re.sub replaces every substring matched by the regular expression, so strings such as URLs and hashtags are replaced with a space " ". The results of preprocessing can be seen in Figure 3.

Figure 3. Preprocessing Data Results
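The cleaning step can be illustrated with the short sketch below. The pattern is reconstructed from the description in the text, and the function name clean_tweet and the final whitespace-collapsing step are assumptions added for illustration.

```python
import re

# Reconstructed pattern: any character that is not a letter, digit, space,
# or tab, OR a URL of the form scheme://...
CLEAN_PATTERN = re.compile(r"([^0-9A-Za-z \t])|(\w+://\S+)")


def clean_tweet(text: str) -> str:
    """Replace every match with a space, then collapse repeated whitespace."""
    return " ".join(CLEAN_PATTERN.sub(" ", text).split())


print(clean_tweet("Bitcoin is down! https://t.co/abc #cryptocrash"))
# prints: Bitcoin is down cryptocrash
```

Note that with this pattern the URL is removed entirely, while for a hashtag only the "#" symbol is stripped and the word itself stays in the text.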
In addition, preprocessing converts all letters in the tweet data to lowercase with the set(lower_case) function. The results of the lowercase conversion can be seen in Figure 4. After the lowercase conversion, the final preprocessing step is the retweet filter, which removes retweets that contain the same content (duplicates). The results of the retweet filter can be seen in Figure 5.

Figure 5. Retweet Filter Results
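The lowercase conversion and retweet filter can be sketched as below. This is one possible reading of the step, not the authors' code: the function name lowercase_and_filter is hypothetical, and the rule that a cleaned retweet begins with the token "rt" is an assumption.

```python
def lowercase_and_filter(tweets):
    """Lowercase each tweet, then drop retweets and repeated texts.

    Assumption: after cleaning, a retweet begins with the token "rt".
    The first occurrence of each text is kept and input order is preserved.
    """
    seen = set()
    kept = []
    for text in tweets:
        lowered = text.lower()
        is_retweet = lowered.split()[:1] == ["rt"]
        if is_retweet or lowered in seen:
            continue
        seen.add(lowered)
        kept.append(lowered)
    return kept


sample = [
    "Bitcoin is down",
    "RT user Bitcoin is down",
    "bitcoin is down",
    "Time to buy the dip",
]
print(lowercase_and_filter(sample))
# prints: ['bitcoin is down', 'time to buy the dip']
```

Lowercasing before deduplication means that tweets differing only in letter case are treated as the same content.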
After preprocessing, the next step is classification and visualization of the results. For sentiment analysis in this study, Textblob is used to identify public opinion on the fall of crypto from the tweet data. The polarity value ranges from -1 to 1. A polarity value approaching 1 indicates a positive opinion, a value approaching -1 indicates a negative opinion, and a value near 0 indicates a neutral opinion. The results of the sentiment classification for the #cryptocrash keyword are shown in Figure 6. Neutral tweets make up the largest share at 57.9%, followed by positive tweets at 35.7% and negative tweets at 6.4%.
Table 1 shows sample tweets from the #cryptocrash data set in the positive, negative, and neutral categories, together with their polarity values.
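The mapping from polarity to a sentiment label can be sketched as follows. The text does not state exact cut-offs, so the thresholds used here (greater than 0 is positive, less than 0 is negative, exactly 0 is neutral) are an assumption, as are the function names. The polarity itself would come from TextBlob(text).sentiment.polarity.

```python
from collections import Counter


def label_polarity(polarity: float) -> str:
    """Map a polarity in [-1, 1] to a label (assumed thresholds around 0)."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def polarity_of(text: str) -> float:
    """Polarity score from Textblob; imported lazily so the rest runs without it."""
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def sentiment_shares(polarities):
    """Percentage of tweets per label, rounded to one decimal place."""
    labels = ("positive", "neutral", "negative")
    counts = Counter(label_polarity(p) for p in polarities)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: round(100 * counts[label] / total, 1) for label in labels}


print(sentiment_shares([0.5, 0.0, 0.0, -0.2]))
# prints: {'positive': 25.0, 'neutral': 50.0, 'negative': 25.0}
```

The polarity values in the usage line are made-up inputs for illustration only and are not taken from the study's data.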

CONCLUSION
Based on the results and discussion in this study, it can be concluded that the data were obtained with the Tweepy library in Python by collecting tweets matching the hashtag #cryptocrash from the 30 days up to 28 September 2022. Of the collected tweets, 57.9% were classified as neutral, 35.7% as positive, and 6.4% as negative.
In future work, this model is expected to be extended by enlarging the dataset and comparing the method with other machine learning approaches.