Addressing Cold Start New User in Recommender System Based on Hybrid Approach: A review and bibliometric analysis

ABSTRACT


INTRODUCTION
The internet has become a basic necessity of daily life, and the number of internet users continues to grow every year. In Indonesia, internet users reached 196.71 million people in 2019-2020 (Q2), or about 73.7% of the population, up from 171.17 million users in 2018 [1], [2]. The internet is used for many activities, one of which is online buying and selling.
Online buying and selling has become commonplace. According to the APJII survey [1], it was among the most frequent uses of the internet in the economic sector in 2018: 37.82% of use was for "information of buying" activities, 32.19% for "buy online", and 16.83% for "online selling". In 2016, about 84.2 million people, or about 63.5%, reported having transacted or shopped online. These figures show that online buying and selling in Indonesia has reached a very large scale.
Given this, web-based online buying and selling systems contribute greatly to presenting information to the public, and the recommendation system plays an important role in their success. Amazon is an example of the real contribution of a good recommendation system: approximately 35% of products purchased from Amazon result from recommendations [3]. Helly Hansen, which sells products such as water ski sports products and raincoats, reported increases of 170% and 52% from new buyers after adopting a good recommendation system [4]. In the film industry, 75% of the films watched by Netflix viewers come from the recommendations offered [3]. These examples show that a good recommendation system benefits both users/buyers and online business owners.
Simply put, a recommendation system provides relevant product recommendations based on user preferences. Several approaches can be used to build one, the most common being content-based filtering and collaborative filtering. Content-based filtering recommends items based on their similarity, in characteristics/attributes, to items the user liked before. Using the metadata/description of items, we can determine how similar items are to each other and so recommend relevant items to the user. Collaborative filtering, meanwhile, recommends items based on the ratings given to them by other users who are similar to the target user. The recommended items are thus influenced by other users' ratings, which is what the term "collaborative" refers to. Collaborative filtering is more widely used because of its advantages; among other things, it does not require defining metadata, characteristics, or detailed descriptions of items as content-based filtering does.
Although widely used, collaborative filtering still has some problems, among them cold start, data sparsity, dynamic taste, and gray sheep. Cold start is the condition where a new user or a new item enters the system while no historical data about that user or item exists. Data sparsity refers to sparse data caused by users rating only certain items. Dynamic taste is the situation where a user's interest in types of items differs over time: a user may be interested in certain items/categories at one time, but their taste changes later, so different recommendations are needed. Finally, the gray sheep problem occurs when a user has unique tastes, making it difficult to find matching users [5].
Several studies have tried to resolve these problems. Solving a problem here means applying a treatment/method that handles data exhibiting the problem. Data sparsity is the most discussed issue, and various approaches address it. The studies [6], [7] use a graph approach, while the study [8] used k-nearest neighbors and threshold-based neighbors techniques on the MovieLens dataset. In addition, [9] proposed a matrix factorization model called Enhanced Singular Value Decomposition (ESVD); matrix factorization is often used to solve data sparsity problems. As for cold start problems, Verma used collaborative filtering and fuzzy c-means clustering. To overcome cold start for new users, that study directed new users to rate a number of items in order to receive appropriate recommendations [10]. This does not really solve the problem, however, because it only asks new users to rate certain items. Verma also used item-based collaborative filtering to solve cold start for new items.
Other studies have tried to solve the problems of dynamic taste and gray sheep. Research [11] presents hybrid techniques for both, and also solves some problems of data sparsity and cold start. That study built 7 blocks to solve the problems in collaborative filtering: dynamic content builder, user similarity finder, item similarity finder, collaborative classifier, dual purpose opinion miner, matrix factorizer, and collective recommender. However, some problems remain. For cold start new users, the method handles the case with the matrix factorizer block and item weight, but this produces the same results for all users, so personalization is still lacking. In addition, the opinion mining method used for rating calculation in [11] cannot be applied to general datasets, because the method (rating prediction) gives fairly good results only on data with short comments/reviews, such as those on the website myopinions.in. Some datasets or websites present long reviews and comments, for which other methods are more appropriate. In this paper, we focus only on the cold start problem for new users.
Based on these remaining problems, future research will improve on research [11] by improving the rating calculation in the Dual Purpose Opinion Miner block and by solving the cold start problem for new users. In this preliminary research, however, a bibliometric analysis is conducted to find solutions to cold start problems.

Recommendation System
A recommendation system can be defined as a system that dynamically filters important information from a large body of information based on a user's interests, profile, or preferences for specific items [12]. Recommendation systems are now widely used because they reduce the effort of searching for suitable goods in various fields. This benefits users, who get items relevant to them, and also benefits sellers, whose sales increase. Several approaches can be used to build recommendation systems. Isinkaye [12] identifies three common approaches: collaborative filtering, content-based filtering, and hybrid filtering. Collaborative filtering is the most mature and most widely used approach; it provides recommendations by identifying users with similar tastes from their ratings. Content-based filtering matches the characteristics of content to the characteristics of items the user previously liked. In this approach, predictions are based on the user's information about content characteristics and ignore other users' contributions. Hybrid approaches combine two or more approaches to deliver better results. Fig. 1 describes the techniques used in recommendation systems.

Research Related to Collaborative Filtering Recommendation System
Collaborative filtering has been used in various studies, but it still has some problems. One is cold start, the condition where new users or new items enter the system with no historical data, so it is difficult to determine appropriate recommendations. Another is data sparsity, which occurs when users rate only a few items, so it is difficult to identify similar users from the sparse data. There is also the gray sheep problem, which occurs when users have unique tastes that are difficult to match with other users. The last is the problem of dynamic taste: most existing approaches do not take into account that taste changes over time. For example, user A may at one time like category K1, but a few months later their taste changes and they begin to like another category, such as K2. This affects which items should be recommended to that user.
Existing studies use a hybrid approach, in which more than one technique is combined to get better results, and have tried to solve the above problems in various ways. However, most studies deal only with data sparsity. The research [6] builds a user interest graph represented by hierarchical tree structures covering a wide range of topics, with three tiers of interest topics from coarse-grained to fine-grained. The study [8] used k-nearest neighbors and threshold-based neighbors techniques on the MovieLens dataset to obtain the most suitable neighbors, thus improving the accuracy of recommendations. The study [13] proposed a deep learning framework for deep collaborative hashing codes on user-item ratings, which adopts neural networks to better learn user and item representations and make them approach binary codes so that quantization loss is minimized. Gupta and Kumar [14] proposed a heterogeneous information network-based recommendation model called HeteroPRS for personalized top-n recommendations using binary implicit feedback. To harness the meta-information related to items, they use the concept of meta-paths, and the model then leverages item popularity and user interest simultaneously to improve the effectiveness of recommendations. In contrast, Du et al. [15] introduced trust relation computing from the field of sociology. The "trust" in question is an integrated trust used for nearest neighbor selection. Trust networks are built by expanding paths of different lengths, and the trust value between users is obtained by trust transmission rules.
Papers [16]-[19] have proposed several approaches to addressing cold start problems. The study [16] proposed a friend recommendation system that combines the Big-Five personality traits model with hybrid filtering, where the friend recommendation process is based on personality traits and user harmony ratings. Similar to Ning's work, Dhelim et al. [20] also use Big-Five personality traits, plus dynamic interests, in their user interest mining system. Herce-Zelaya et al. [21] build profiles of user behavior from social media using classification trees and random forests to create predictions and address cold start problems. Wang et al. [22] introduced neighboring factors and time functions and used dynamic selection models to select adjacent sets of objects. Meanwhile, Jipmo et al. [23] proposed Frisk, an unsupervised multilingual system for classifying Twitter users' interests.
Ahmedian et al. [18] proposed a social recommendation system based on implicit social relationships; they used Dempster-Shafer theory to model the implicit relationships and introduced a new measure for unreliable predictions using a neighborhood improvement mechanism. Trikha et al. [24] studied the possibility of predicting users' implicit interests based only on topic matching, using frequent pattern mining without considering the semantic relatedness of the topics. Zarrinkalam et al. [25] introduced a graph-based link prediction system that works over a representation model constructed from three classes of information: users' explicit and implicit contributions to topics, relationships among users, and the similarity between topics. Ebrahimi and Golpayegani [26] use similar social relationships between users, measured with the Jaccard coefficient, and build the social network information from users' rating histories. Meanwhile, Katarya [27] uses demographic filtering and psychographic attributes such as lifestyle, interests, and personality.
The authors of [28] proposed an approach to predict the interests of new or inactive users based on different social links between users; they introduced a random-walk-based mutual reinforcement model that incorporates both text and links. Zarrinkalam et al. [29] introduced a framework that operates on the temporal evolution of user interests and uses semantic information extracted from knowledge bases such as Wikipedia to predict users' future interests. Veličković et al. [30] introduced graph attention networks (GATs), a neural network architecture that works on graph-structured data; by leveraging masked self-attentional layers, GATs address the drawbacks of conventional schemes based on graph convolutions. Sadeghian et al. [31] presented a neural network architecture that learns vector representations of hotels by incorporating different sources of data, such as user clicks, hotel attributes (e.g., hotel type, property star rating, average user rating), additional service information (e.g., free Wi-Fi and free breakfast), and location information. During model training, a joint embedding is learned from all of this information. Sun et al. [19] introduced a knowledge graph embedding method named RotatE, which can model and infer different relation patterns, including symmetry/antisymmetry, composition, and inversion. Specifically, RotatE represents each relation as a rotation from the source entity to the target entity in the complex vector space.
Meanwhile, only a few studies have discussed the issue of dynamic taste or gray sheep. Chen et al. [32] proposed a new personalized recommendation algorithm called Attention Flow Network based Personalized Recommendation (AFNPR). On the other hand, Zhang et al. [33] propose to dynamically select negative training samples from the ranking list generated by the current prediction model and repeatedly update the model, which can reduce training time and lead to significant performance improvements. Tewari and Barman [11] built 7 hybrid technique blocks, each with a different function: dynamic content builder, user similarity finder, item similarity finder, collaborative classifier, dual purpose opinion miner, matrix factorizer, and collective recommender. Their research presents fairly good results because it has solved these problems. More details about the 7 blocks can be seen in Fig. 2, while Table 1 shows various studies and the problems they have solved.
Fig. 2. Block diagram of the recommendation system [11]

Recommendation System Evaluation
Several metrics can be used to evaluate a recommendation system. Isinkaye [12] divides the commonly used accuracy metrics into two categories: statistical accuracy and decision support accuracy. Statistical accuracy metrics compare the predicted ratings against the user's actual ratings. The most common are Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and correlation. MAE is given by Eqn 1 [12]:
\[ MAE = \frac{1}{N} \sum_{u,i} \left| p_{u,i} - r_{u,i} \right| \tag{1} \]
where $p_{u,i}$ is the predicted rating of user u for item i, $r_{u,i}$ is the actual rating of user u for item i, and N is the total number of ratings. RMSE is given by Eqn 2:
\[ RMSE = \sqrt{\frac{1}{N} \sum_{u,i} \left( p_{u,i} - r_{u,i} \right)^2} \tag{2} \]
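As an illustration of the two statistical accuracy metrics, the following minimal Python sketch computes MAE and RMSE from paired lists of predicted and actual ratings. The function names and the sample ratings are assumptions made for this example and do not come from [12].

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average of |p - r| over all rated (user, item) pairs."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root Mean Square Error: square root of the average of (p - r)^2."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Hypothetical ratings on a 1-5 scale, one entry per (user, item) pair.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

print("MAE:", mae(predicted, actual))
print("RMSE:", rmse(predicted, actual))
```

Because RMSE squares each error before averaging, the single error of 2 in this sample weighs more heavily in RMSE than in MAE.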
On the other hand, commonly used decision support accuracy metrics are reversal rate, weighted errors, Receiver Operating Characteristics (ROC), Precision Recall Curve (PRC), precision, recall, and F-measure. Precision is the fraction of recommended items that are actually relevant to the user, while recall is the fraction of relevant items that are also part of the recommended set. The F-measure combines precision and recall into a single value.
To calculate precision and recall, we can build a confusion matrix such as Table 2, a 2x2 matrix with four values [34]. True Positive (TP) is a relevant item that is recommended to the user. True Negative (TN) is an irrelevant item that is correctly not recommended. False Positive (FP) is an item that is recommended but should not be. False Negative (FN) is an item that is not recommended but should be. Precision and recall are then given by Eqn 3 and Eqn 4:
\[ Precision = \frac{TP}{TP + FP} \tag{3} \]
\[ Recall = \frac{TP}{TP + FN} \tag{4} \]
The F-measure is given by Eqn 5:
\[ F\text{-}measure = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{5} \]
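The confusion-matrix definitions can be made concrete with a short Python sketch. It derives TP, FP, FN, and TN from a recommended set, a relevant set, and the full catalog, then computes precision, recall, and F-measure in their standard forms. The function names and item identifiers are hypothetical.

```python
def confusion_counts(recommended, relevant, catalog):
    """Return (TP, FP, FN, TN) for one user's recommendation list."""
    recommended, relevant, catalog = set(recommended), set(relevant), set(catalog)
    tp = len(recommended & relevant)            # relevant and recommended
    fp = len(recommended - relevant)            # recommended but not relevant
    fn = len(relevant - recommended)            # relevant but not recommended
    tn = len(catalog - recommended - relevant)  # neither relevant nor recommended
    return tp, fp, fn, tn


def precision_recall_f(tp, fp, fn):
    """Standard precision, recall, and F-measure; 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical catalog of eight items.
catalog = ["a", "b", "c", "d", "e", "f", "g", "h"]
recommended = ["a", "b", "c", "d"]
relevant = ["a", "b", "e"]

tp, fp, fn, tn = confusion_counts(recommended, relevant, catalog)
precision, recall, f_measure = precision_recall_f(tp, fp, fn)
print(tp, fp, fn, tn)
print(precision, recall, f_measure)
```

Note that TN does not enter any of the three measures, which is why they remain informative even when the catalog is much larger than the recommendation list.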

HYBRID TECHNIQUES WITH 7 BLOCKS
The study [11] proposed hybrid techniques that use seven different blocks to solve collaborative filtering problems. Fig. 2 shows the 7 blocks used to produce recommendations for users.

Dynamic Content Builder (DCB)
The DCB block builds user profiles using a content-based filtering approach. With this block, the system can address the dynamic taste problem by forming a Dynamic Keywords Vector (DKV) that stores the keywords of purchased items together with time data. If the user buys an item in the same category/keyword, the value in the database increases by 1 and the time data is updated. Table 3 shows an example of a DKV in table form. The DCB block also generates an item description vector for each item, containing detailed information about the product. This information is filled in by the seller of the item, by category. The item description vector is used by the ISF block to compute similarities between items.
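The DKV update rule (increase the keyword value by 1 and refresh the time data on each purchase) can be sketched as follows. The dictionary layout, the field names `count` and `last_purchased`, and the sample keywords are assumptions made for illustration; the actual storage format used in [11] is not reproduced here.

```python
from datetime import datetime


def update_dkv(dkv, keywords, purchased_at):
    """Update a user's Dynamic Keywords Vector after one purchase.

    dkv maps keyword -> {"count": int, "last_purchased": datetime}.
    Each keyword of the purchased item has its count increased by 1
    and its time data replaced by the purchase time.
    """
    for keyword in keywords:
        entry = dkv.setdefault(keyword, {"count": 0, "last_purchased": None})
        entry["count"] += 1
        entry["last_purchased"] = purchased_at
    return dkv


# Hypothetical purchase history for one user.
user_dkv = {}
first = datetime(2020, 1, 5)
second = datetime(2020, 3, 9)
update_dkv(user_dkv, ["raincoat", "outdoor"], first)
update_dkv(user_dkv, ["raincoat"], second)
print(user_dkv)
```

Keeping the timestamp next to the count is what would let a later step give less weight to keywords the user has not bought recently, which is the point of tracking dynamic taste.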

User Similarity Finder (USF)
Once each user's profile has been created, a user-keyword matrix is obtained. The USF block then calculates the weight of each value in the user-keyword matrix with Eqn 6:
\[ w_{ji} = TF_{i,j} \times IDF_i \tag{6} \]
where $w_{ji}$ is the weight for user j and keyword i, TF is the Term Frequency, and IDF is the Inverse Document Frequency. The cosine similarity between user vectors is then calculated with Eqn 7.
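A minimal sketch of this weighting and similarity step is given below. It assumes that TF is the raw keyword count in a user's profile and that IDF is the natural logarithm of the number of users divided by the number of users whose profile contains the keyword; these are common choices, not necessarily the exact definitions in [11]. The cosine similarity is the standard one, and all names and sample data are hypothetical.

```python
import math


def tfidf_weights(user_keyword_counts):
    """Weight each user-keyword count by TF x IDF (assumed: TF = count, IDF = ln(N / df))."""
    n_users = len(user_keyword_counts)
    doc_freq = {}
    for counts in user_keyword_counts.values():
        for keyword, count in counts.items():
            if count > 0:
                doc_freq[keyword] = doc_freq.get(keyword, 0) + 1
    return {
        user: {
            keyword: count * math.log(n_users / doc_freq[keyword])
            for keyword, count in counts.items()
            if count > 0
        }
        for user, counts in user_keyword_counts.items()
    }


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(keyword, 0.0) for keyword, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user-keyword matrix (keyword purchase counts per user).
counts = {
    "u1": {"shoes": 3, "rain": 1},
    "u2": {"shoes": 1, "rain": 2},
    "u3": {"books": 4},
}
weights = tfidf_weights(counts)
print(cosine_similarity(weights["u1"], weights["u2"]))
print(cosine_similarity(weights["u1"], weights["u3"]))
```

Under this IDF definition, a keyword that appears in every user's profile gets weight 0, so it does not contribute to the similarity between users.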

Item Similarity Finder (ISF)
This block is used to solve the cold start problem for new items by calculating the similarities between items with Eqn 8. To determine these similarities, the ISF block uses the item description vectors generated by the DCB block.

Collaborative Classifier (CC)
This block generates recommendations with a collaborative filtering approach: the CC block determines the k nearest neighbors of the target user by using the USF block. This block does not generate a rating for items that the target user may like; instead, it classifies items as ones the target user may or may not like.
To predict the rating for the target user, Eqn 9 is used, where $r_{x,k}$ is the rating of target user x for item k, $\bar{r}_x$ is the average rating of target user x, $r_{m,k}$ is the rating of neighbor m for item k, $\bar{r}_m$ is the average rating of user m, $sim(x, m)$ is the similarity between target user x and neighboring user m, and c is the number of neighbors of target user x.
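The quantities listed for Eqn 9 match the widely used mean-centered neighborhood prediction, in which the target user's average rating is adjusted by the similarity-weighted deviations of the neighbors' ratings from their own averages. The sketch below implements that standard form as an assumption; it is not a verified transcription of Eqn 9 in [11], and all names and values are hypothetical.

```python
def predict_rating(target_mean, neighbors, item):
    """Mean-centered k-nearest-neighbor rating prediction.

    neighbors is a list of (similarity, neighbor_mean, neighbor_ratings),
    where neighbor_ratings maps item -> rating. Neighbors that have not
    rated the item are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for similarity, neighbor_mean, ratings in neighbors:
        if item in ratings:
            numerator += similarity * (ratings[item] - neighbor_mean)
            denominator += abs(similarity)
    if denominator == 0:
        return target_mean  # no usable neighbor: fall back to the user's average
    return target_mean + numerator / denominator


# Hypothetical target user x with average rating 3.5 and three neighbors.
neighbors = [
    (0.8, 4.0, {"k": 5.0}),
    (0.4, 3.0, {"k": 2.0}),
    (0.9, 2.0, {}),  # has not rated item k
]
prediction = predict_rating(3.5, neighbors, "k")
print(prediction)
```

Centering on each user's average compensates for users who rate systematically high or low, so a strict neighbor's 4 can count as much as a generous neighbor's 5.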

Dual Purpose Opinion Miner (DOM)
This block has two main objectives:
• Calculate the sentiment score of each item review, which is then converted into a rating.
• Calculate the popularity of different items by calculating a weight for each item.
To calculate sentiment, each review is split into words, and the adjectives in the review are passed to a function that uses SentiWordNet to determine the overall sentiment value. If a negation word such as "not", "never", or "no" appears near an adjective, the adjective's value is changed to negative. The rating is then calculated using Eqn 10 for positive sentiment and Eqn 11 for negative sentiment.

𝑅𝑎𝑡𝑖𝑛𝑔 = M𝑅
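The sentiment step described above can be sketched in Python under stated assumptions: a tiny hand-made adjective lexicon stands in for SentiWordNet, "near" is taken to mean within the two preceding words, and a negated adjective is forced to a negative value as the text describes. The conversion of the score into a rating (Eqn 10 and Eqn 11) is not included. All names and scores are hypothetical.

```python
# Hypothetical adjective scores standing in for SentiWordNet lookups.
ADJECTIVE_SCORES = {"good": 0.6, "great": 0.8, "bad": -0.6, "slow": -0.4}
NEGATIONS = {"not", "never", "no"}


def review_sentiment(review, window=2):
    """Sum adjective scores; an adjective preceded by a nearby negation counts as negative."""
    cleaned = review.lower().replace(",", " ").replace(".", " ")
    tokens = cleaned.split()
    total = 0.0
    for index, token in enumerate(tokens):
        if token not in ADJECTIVE_SCORES:
            continue
        score = ADJECTIVE_SCORES[token]
        preceding = tokens[max(0, index - window):index]
        if any(word in NEGATIONS for word in preceding):
            score = -abs(score)  # value is changed to negative, as described in the text
        total += score
    return total


print(review_sentiment("Great phone"))
print(review_sentiment("The camera is good, but the battery is never great."))
print(review_sentiment("good " * 5))
```

Because the score is a plain sum over adjectives, the last call shows how a review's length alone inflates its score.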
Next, the item weight is calculated. For each item, the Average Review Rating (ARR) and the number of users who have reviewed it are calculated. Eqn 12 is used to calculate the item weight, where $S_{max}$ is the highest scale of the sentiment score, $N_i$ is the number of users who reviewed item i, ARR is the average rating of each item, and $N_{max}$ is the highest value of $N_i$ over all items.

Matrix Factorizer (MF)
The Matrix Factorizer block takes the ratings from the DOM block as input and builds a rating matrix like Fig. 3. Eqn 13 is used to predict the rating of an unknown item. In Eqn 13, the user vector is the parameter vector of the i-th user with respect to the item features, while the item vector is the weight vector of the hidden features of item j. Minimizing the squared error function with gradient descent and regularization then yields Eqn 14 and Eqn 15.
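The idea of predicting a rating as the inner product of a user vector and an item vector, fitted by gradient descent on a regularized squared error, can be sketched as follows. This is a generic stochastic gradient descent formulation written for illustration, not a transcription of Eqn 13 to Eqn 15. The names `P` and `Q`, the hyperparameters, and the sample ratings are assumptions.

```python
import random


def predict(user_vector, item_vector):
    """Predicted rating: inner product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(user_vector, item_vector))


def squared_error(ratings, P, Q):
    """Sum of squared errors over the observed (user, item, rating) triples."""
    return sum((r - predict(P[u], Q[i])) ** 2 for u, i, r in ratings)


def factorize(ratings, n_users, n_items, n_factors=2,
              lr=0.01, reg=0.02, epochs=1000, seed=0):
    """Learn user factors P and item factors Q with SGD and L2 regularization."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P[u], Q[i])
            for f in range(n_factors):
                p_uf, q_if = P[u][f], Q[i][f]
                # Step along the negative gradient of the regularized squared error.
                P[u][f] += lr * (err * q_if - reg * p_uf)
                Q[i][f] += lr * (err * p_uf - reg * q_if)
    return P, Q


# Hypothetical observed ratings as (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = factorize(ratings, n_users=3, n_items=3, epochs=0)  # initial factors only
P, Q = factorize(ratings, n_users=3, n_items=3)
print("error before training:", squared_error(ratings, P0, Q0))
print("error after training:", squared_error(ratings, P, Q))
print("prediction for unseen (user 0, item 2):", predict(P[0], Q[2]))
```

A new user has no observed ratings, so no gradient step ever updates that user's vector; this is one way to see why matrix factorization alone does not personalize results for cold start new users.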

Collective Recommender (CR)
The CR block generates recommendations for users. It takes input from the CC, DOM, and MF blocks for Eqn 16:
\[ FRV_{ui} = w_1 \, CC_i + w_2 \, DOM_i + w_3 \, MF_i \tag{16} \]
where $FRV_{ui}$ is the final value of item i for user u, $CC_i$, $DOM_i$, and $MF_i$ are the values produced for item i by the CC, DOM, and MF blocks, and $w_1$, $w_2$, and $w_3$ are weights with values between 0 and 1. To solve the cold start problem for a new item, Eqn 17 is used to obtain the final value of the new item for user u; it uses the item similarity ItemSim between two items, normalized by the sum of these similarities.
Next, the items are sorted by their final FRV value, from highest to lowest, and the top-n items are selected for recommendation to the user.
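The combination and ranking step can be illustrated with a short sketch: each item's final value is a weighted sum of the outputs of the CC, DOM, and MF blocks, and the items are then sorted in descending order to select the top-n. The function names, the weights, and the block outputs below are hypothetical, and the new-item adjustment of Eqn 17 is not covered.

```python
def final_values(cc, dom, mf, w_cc, w_dom, w_mf):
    """Weighted sum of the three block outputs per item; weights must lie in [0, 1]."""
    for weight in (w_cc, w_dom, w_mf):
        if not 0 <= weight <= 1:
            raise ValueError("weights must be between 0 and 1")
    items = set(cc) | set(dom) | set(mf)
    return {
        item: w_cc * cc.get(item, 0.0) + w_dom * dom.get(item, 0.0) + w_mf * mf.get(item, 0.0)
        for item in items
    }


def top_n(values, n):
    """Items sorted from highest to lowest final value, truncated to n."""
    ranked = sorted(values.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:n]]


# Hypothetical block outputs for one user, all scaled to [0, 1].
cc = {"a": 1.0, "b": 0.0, "c": 1.0}    # classifier: may like (1) or may not like (0)
dom = {"a": 0.2, "b": 0.9, "c": 0.5}   # item weight (popularity)
mf = {"a": 0.8, "b": 0.6, "c": 0.4}    # scaled predicted rating

frv = final_values(cc, dom, mf, w_cc=0.5, w_dom=0.3, w_mf=0.2)
print(top_n(frv, 2))
```

For a new user the CC term carries no personal signal, so the ranking is driven by the item weight and matrix factorizer terms and tends to be the same for every new user.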

Problems in 7 Blocks of Hybrid Techniques
Based on the description of the 7 blocks of hybrid techniques above, this method gives fairly good results, but the study still has some drawbacks. One is that the opinion mining method (rating prediction) is not appropriate: it cannot properly represent long sentences in reviews, because it accumulates the adjectives of a review to obtain the review's sentiment score. A review with many positive adjectives is not necessarily positive, because the positive adjectives may describe other, better products that the reviewer is comparing against. Conversely, in a review with many negative adjectives, the user may only be citing other products as negative examples, so the review should count as positive for the target item.
In addition, the opinion mining method weights short and long reviews too differently. A short and a long review may carry the same weight in assessing a particular item, but the long review receives a much greater weight because it contains more adjectives, even though the overall content is equivalent. This causes an imbalance in the rating calculation between short and long reviews. Thus, a long negative review may receive a higher rating than a short positive review, which should have received the higher rating even though it is short.
This also relates to the dataset used in the study, which contains very short reviews, while some datasets or websites present fairly long reviews. Therefore, other, more appropriate methods can be tried for different datasets. Figure 4 compares the reviews used in research [11] with reviews on the TripAdvisor website. On the other hand, research [11] has not yet thoroughly resolved the cold start problem, because it only uses approaches that solve cold start for new items, while cold start for new users is not resolved. For a new user, the blocks with the most effect are the item weight and matrix factorizer blocks, so the approach resembles one that recommends items by popularity. If social information about the user is available, it can be taken into account in the recommendations. Therefore, additional methods are needed to solve this problem.

BIBLIOMETRIC ANALYSIS
a. Scopus Database
Scopus is an indexing database of reputable international scientific publications. It is an abstract and citation database of peer-reviewed scientific literature, covering journals, books, and conference proceedings. This database provides a comprehensive overview of research results from around the world.

b. Bibliometric
Bibliometric analysis in information science is a study that can reveal patterns of document use and the development of literature or information sources in a subject area. Bibliometrics includes two types of studies: descriptive and evaluative. Descriptive studies analyze the productivity of articles, books, and other formats by looking at authorship patterns such as the author's gender, the author's type of work, level of collaboration, author productivity, the institution where the author works, and the subject of the article. Evaluative studies analyze the use of the literature by counting references or citations in research articles, books, or other formats [35].

c. Mapping based on Co-Word
Co-word-based mapping is a mapping based on the co-occurrence of important or unique terms in articles, which can be identified from the title or abstract alone. Each term derived from the subject analysis represents a concept.
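The co-occurrence counting that underlies a co-word map can be sketched in a few lines of Python: for each article, every unordered pair of distinct terms is counted once. This only illustrates the counting idea and is not VOSviewer's own algorithm, which additionally normalizes, lays out, and clusters the network. The function name and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents_keywords):
    """Count, for each unordered keyword pair, the number of documents containing both."""
    counts = Counter()
    for keywords in documents_keywords:
        unique = sorted({keyword.lower() for keyword in keywords})
        for first, second in combinations(unique, 2):
            counts[(first, second)] += 1
    return counts


# Hypothetical keyword lists, one per article.
docs = [
    ["cold start", "collaborative filtering", "knowledge graph"],
    ["Cold Start", "collaborative filtering"],
    ["cold start", "social network"],
]
pairs = cooccurrence_counts(docs)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs are stored in alphabetical order so that ("a", "b") and ("b", "a") are counted as the same edge of the network.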

d. VOSviewer
VOSviewer is software used to visualize bibliographies, or datasets containing bibliographic fields such as article titles, author names, and journals. In research, VOSviewer is used for bibliometric analysis, for finding topics that still offer research opportunities, for finding the most widely used references in a particular field, and so on.

e. Datasets
The data used is metadata extracted from the Scopus database using the keyword "graph-based recommendation system for cold start" for the years 2004 to 2020. All information was exported to CSV format for data analysis, in particular to generate the word co-occurrence network with VOSviewer. Metadata for 584 articles was collected.

f. Network Visualization
The co-word map network for the keyword "graph-based recommendation system for cold start" is divided into 8 clusters, as seen in Figure 5. Nodes, represented by circles, can be publications, journals, researchers, or keywords, while edges indicate relationships between pairs of nodes. Edges also indicate the strength of the relationship, represented by distance: the closer two nodes are, the stronger the connection between them. The larger the text, the greater the intensity.
• Cluster 1: The red cluster consists of 30 items, including collaborative filtering, filtering, personality, hybrid approach, new user problem, sparsity problem, top n recommendation, traditional method, and user item matrix.

g. Overlay Visualization
This visualization maps research trends by the year in which articles were published. The color legend at the bottom right of Figure 6 shows that the darker a node's color, the longer the topic has been discussed in the research. The closeness between words indicates the closeness of the relationship between the two nodes. The larger the text, the greater the intensity. Researchers associate cold start problems with several terms, including collaborative filtering, low accuracy, social recommendation, social network, poi, deep learning, and knowledge graph. Judging by the color, cold start is still an active research topic.

h. Density Visualization
This visualization shows the level of saturation (density), indicated by the number of frequently appearing keywords, which are marked in yellow. In Figure 7, the brighter the color, the more the term has been discussed by previous researchers. The closer two words are, the closer their relevance. Unconnected terms could be the next research opportunity.

ANALYSIS AND DISCUSSION OF EXISTING METHODS
An exhaustive survey of cold start shows that several methods or techniques have been proposed to address the cold start problem [36]-[40]. Cold start problems fall into two types: cold start for new users and cold start for new items. Here we summarize the solutions proposed by researchers and then outline future challenges for cold start problems.
Most papers either use techniques that already exist in the literature (such as k-means), propose extensions of them, or combine algorithms to obtain new hybrid algorithms (such as singular value decomposition and deep learning). In more detail, [9], [36], [41]-[45] use matrix factorization techniques such as singular value decomposition (SVD) or probabilistic matrix factorization (PMF), or new proposals based on these techniques; [46]-[49] use hybrid techniques; and [50]-[53] use k-means or its extensions. Based on this, techniques based on the matrix factorization (MF) model appear to be the main method used.
The matrix factorization model maps the characteristics of users and items into a latent factor space and predicts the user's evaluation of an item by calculating the similarity between user interests and target items [9].
K-means clustering is a partitioning method that can divide the data set of items into several subsets according to a given distance metric, where the items constituting the subset are as close as possible to each other [54].
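As a concrete illustration of this partitioning idea, the sketch below implements plain k-means (Lloyd's algorithm) with squared Euclidean distance. Taking the first k points as initial centroids is a simplifying assumption made for reproducibility; practical implementations usually use random or k-means++ initialization. The function names and the sample points are hypothetical.

```python
def squared_distance(point_a, point_b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(point_a, point_b))


def kmeans(points, k, max_iter=100):
    """Partition points into k clusters; returns (labels, centroids)."""
    centroids = [tuple(point) for point in points[:k]]  # assumed deterministic start
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this iteration
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical 2-D item features forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

In a cold start setting, such clusters could be used to place a new user or item in the nearest group and borrow that group's preferences, which is how k-means based approaches are typically applied.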
The many solutions introduced by researchers imply that the problem is not new. However, the newest and most challenging task related to cold start concerns the computational time needed to determine recommendations for new users, due to the rapid increase in data volume.

CONCLUSION
To build a recommendation system that can solve the problems of cold start, data sparsity, dynamic taste, and gray sheep, it is necessary to use hybrid methods that combine more than one technique. Each method in a hybrid technique provides a specific treatment for data exhibiting cold start, data sparsity, dynamic taste, or gray sheep. Research [11] has proposed 7 blocks of hybrid techniques that can solve these problems, namely DCB, USF, ISF, CC, DOM, MF, and CR.
The rating calculation method in the study [11] cannot be used on general datasets, whose reviews vary in length from very short to very long, and this can decrease the quality of the recommendation system. In addition, research [11] has not yet thoroughly resolved the cold start problem, because it only uses approaches that solve cold start for new items, while cold start for new users is not resolved. The approach used only provides recommendations based on item popularity, whereas if social information about the user is available, it can be taken into account in the recommendations, as seen in Figure 6. Associating cold start problems with social media information and knowledge graphs can be tried as a way to solve the cold start problem for new users. Further work will be carried out on knowledge graph-based recommendation systems that use social networking data to solve cold start problems.

Table 1. Comparison of Research on Solved Problems

Fig. 2 block labels: Collaborative Classifier, Review Rating Calculator, Item Weight Calculator, Dual Purpose Opinion Miner, Matrix Factorizer, Collective Recommender

Table 3. Examples of DKV

• Cluster 2: The green cluster consists of 27 items, including baseline method, accurate recommendation, future research, machine learning, new algorithm, probability, social network, tree, and user interest.
• Cluster 3: The blue cluster consists of 22 items, including cold start scenario, deep learning, mae, rmse, social graph, user item interaction, user profile, popular method, and item similarity.
• Cluster 4: The yellow cluster consists of 16 items, including data sparsity problem, higher accuracy, historical rating, reliability, trust information, and novelty.
• Cluster 5: The purple cluster consists of 13 items, including additional information, low accuracy, neighborhood, svd, user similarity, and prediction accuracy.
• Cluster 6: The light blue cluster consists of 13 items, including cold start issue, poi, matrix factorization, social relationship, trust relationship, and user satisfaction.
• Cluster 7: The orange cluster consists of 11 items, including association, data mining, lda, knowledge graph, text mining, recent research, and user behavior.
• Cluster 8: The brown cluster consists of 6 items, including rank, rating data, heterogeneous network, novel approach, personalized recommendation system, and user item rating matrix.