Multi-aspect, text classification, sentiment analysis.


Sentiment analysis (SA) is an active field of research within opinion mining. It aims to extract a clear opinion from text and classify it (e.g., as positive, negative, or neutral). Sentiment analysis has gained importance with the growth of social media: most people now rely on social media to find information about a product or service, and they also use it to express their own opinions about products and services, yet more research is needed on how to mine those opinions. Most sentiment analysis research is based on English data sets, and studies on Arabic sentiment are few [1-3], although Arabic is an important and widespread language when compared with French and Chinese [4-5].

Moreover, publicly available Arabic data sets are seldom found on the Web. We start from an educational Arabic data set and try to determine its aspects in order to examine it further (e.g., the importance of each subject, how closely the subject relates to real work, etc.). Several challenges make handling Arabic text more difficult than handling English text, such as the small number of resources, the use of Arabic dialects, and the lack of preprocessing tools [5-6]. To perform sentiment analysis for any domain, the process starts with collecting data from social media and loading it to create the data set. Preprocessing follows (stop-word removal, normalization, stemming), as illustrated in Figure 1. After this step the data is ready for classification, which labels each document according to its content and extracts its sentiment orientation.

Fig 1 Process of Sentiment Analysis [7-8]

This paper presents an enhanced technique for multi-aspect sentiment analysis of Arabic-language data.

Toward this end, we make the following contributions:

  1. Collect an Arabic social data set.
  2. Create a WordNet that is used for semantic analysis.
  3. Build more than one method for filtering and text preprocessing (stemming, stop-word removal).
  4. Evaluate different hybrid machine learning algorithms (NB, SVM, HMM) for multi-aspect sentiment analysis.

This paper is organized as follows: Section 2 reviews related work on sentiment analysis. Section 3 discusses the framework of the proposed system. The data set and experimental results are discussed in Section 4, and finally the conclusion is given in Section 5.

Related work

Sentiment classification aims to determine from reviews whether people like or dislike a product, and it has emerged as a research area in its own right. In this section we give an overview of research on sentiment analysis, including work on the Arabic language [9].

A multi-aspect approach on English data sets for three domains was presented in [10]. The domains are laptops, restaurants, and an unknown domain, with 254 reviews for restaurants and 277 reviews for laptops. Supervised techniques are used for sentiment classification in the laptop and restaurant domains, while an unsupervised technique is used for the unknown domain. The authors first create a list that identifies the orientation of every feature in each category by specifying a <category, OTE> pair for every opinion in a given review; if a word is not related to any aspect, the next and previous words are used to build this relationship. The orientation of every review is calculated through three lexicons (MPQA, SentiWordNet, and the B. Liu lexicon), each tested separately for each domain. The authors try more than one approach, but the small data set they collected results in low accuracy.

In [11], the authors propose an approach for determining sentiment opinions that can be used in the tourism domain, and they evaluate it on data collected from restaurants and hotels. The experiments are carried out in three main steps: aspect extraction, subjectivity classification, and finally sentiment classification. The method reaches a precision of 90% in specifying the orientation, but it extracts only 35% of the aspect expressions.

In [12], the authors create three approaches for ranking aspects according to their effect on the overall review, all based on aspect frequency. The first, a frequency-based method, ranks all aspects according to their frequency. The second, a correlation-based method, calculates the correlation between the opinions on a given aspect and the total rating; aspects that receive more than one kind of opinion are considered dependable and are ranked accordingly. The third is a hybrid approach that takes both frequency and correlation into account.

In [13], 66,512 reviews of 652 restaurants were collected. Each review is assigned to one or more of the following six aspects: miscellaneous, price, service, time, efficiency, and cleanliness of the place. Two supervised machine learning algorithms are used: Support Vector Regression (SVR), using the LIBSVM library with its default parameters, and perceptron ranking, which is an ordinal regression method. All of the techniques above were applied to English content. Arabic, in contrast, is one of the languages that produces a large amount of data on social networks yet is among the least analyzed.

Research on Arabic sentiment analysis faces a number of difficulties:

  1. Text on social media is written in any format, including Arabic slang; this is the main difficulty.
  2. Emoticons and idioms are used that cannot be understood.
  3. The same word may carry different sentiment in different domains.
  4. Arabic data sets are rare.
  5. Natural language processing (NLP) applications for Arabic are rare.

In [14], the authors used Arabic dialects, emoticons, and a crowdsourcing approach. They collected 35,000 tweets, divided them into positive, negative, and neutral, and applied three machine learning algorithms: SVM, NB, and KNN. In [15], the authors used only SVM for sentiment analysis of Arabic tweets, obtaining an accuracy of about 78%. A local grammar approach was developed in [16] using financial data; it achieves a precision of about 84.2% but a low recall of 14.1%. In [17], YouTube comments in Arabic are used for sentiment analysis, and two classification techniques (a machine learning technique and a SentiWord lexicon) are compared, with accuracies of 88.3% and 94.2%, respectively. In [18], the authors used 200 Facebook comments from 220 posts and applied machine learning algorithms for classification (NB, SVM, DS), obtaining an accuracy of about 73.3%, a result attributable to the small data set. In [19], the authors built the largest data set, about 63,000 comments from book readers, and classified by rating (1 and 2 for negative, 3 for neutral, 4 and 5 for positive). They achieved about 89.4% with SVM, without using any preprocessing stage, which is important for the Arabic language. In [20], the authors examine the effect of preprocessing steps on tweets written in an Arabic dialect. They collected 1,000 tweets (500 positive and 500 negative) and used half of them to build a lexicon, assigning each word a weight that is later used for classification.

The Proposed Framework

The proposed system contains three phases: a data preprocessing phase, a training phase, and a classification phase. The following sections describe each phase in detail. The general framework of the proposed system is shown in Figure 2.

Data Preprocessing

This section discusses the preprocessing stage, which includes data cleaning, data division, normalization, stemming, and stop-word removal. The preprocessed data is used to build the classification model that is later applied in the classification process. The following subsections describe each preprocessing step.

Fig 2 proposed system for multi aspect Sentiment analysis


Normalization

First, normalization writes all letters in a single format. For example, the letter alef can be written as “ا” or “أ”, and the same holds for other letters; normalization maps such variants to one form. This makes the other preprocessing steps easier.
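The normalization step can be sketched in Python as follows. This is only an illustration, not the authors' implementation: the exact rule set is an assumption. Unifying the alef variants follows the example in the text, while removing diacritics and the tatweel (elongation) character are common additional rules that the paper does not confirm.

```python
import re

# Assumed rule set; only the alef rule is taken from the text.
ALEF_VARIANTS = re.compile("[\u0623\u0625\u0622]")  # أ إ آ
DIACRITICS = re.compile("[\u064B-\u0652]")          # tanween, short vowels, shadda, sukun
TATWEEL = "\u0640"                                  # elongation character
BARE_ALEF = "\u0627"                                # ا


def normalize_arabic(text: str) -> str:
    """Map letter variants to one canonical form."""
    text = DIACRITICS.sub("", text)           # drop diacritics (assumption)
    text = text.replace(TATWEEL, "")          # drop tatweel (assumption)
    text = ALEF_VARIANTS.sub(BARE_ALEF, text) # unify alef forms (from the text)
    return text
```

After this step, two spellings of the same word that differ only in the form of alef compare as equal, which is what the later stop-word and stemming steps rely on.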

Stop-words removal

Second, we collected a stop-word list from the Internet, made up of words that most sites consider useless in search. Stop words are words that do not affect the meaning, such as (كان, ان, هو, هى, ...); removing them saves time and space. C# code was built to extract these words from our data set.
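A minimal Python sketch of stop-word removal is given below. It is not the authors' C# code; the stop-word set contains only the four examples quoted in the text, and the function name and sample sentence are hypothetical.

```python
# Illustrative stop-word set: only the four examples quoted in the text.
# A real list collected from the Internet would be much longer.
STOP_WORDS = {"كان", "ان", "هو", "هى"}


def remove_stop_words(tokens):
    """Return the tokens that are not in the stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]


# Hypothetical sample comment, split on whitespace.
tokens = "هو كان طالب مجتهد".split()
filtered = remove_stop_words(tokens)
```

Using a set keeps each membership test constant-time, which matters when the list grows to several hundred entries.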


Stemming

Stemming returns every word to its root. After the stop words are removed, we apply an Arabic stemming library to the data set, which removes the prefix and suffix from every word to obtain its base. Stemming addresses the problem that the same word can be written in more than one form. For example, the words “أحبت, يحب, تحب, حب” are all converted after stemming to the base “حب”, which carries the same meaning [21].
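The idea of affix stripping can be illustrated with a toy stemmer in Python. This is a stand-in for the Arabic stemming library mentioned in the text, not that library: the prefix and suffix lists and the minimum stem length are hypothetical, chosen only so that the four example words reduce to the same base.

```python
# Hypothetical affix lists for illustration only; a real Arabic stemmer
# uses far richer rules and pattern matching.
PREFIXES = ("ال", "أ", "ا", "ي", "ت", "ن")
SUFFIXES = ("ون", "ين", "ات", "ت", "ة")


def light_stem(word, min_len=2):
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= min_len:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= min_len:
            word = word[:-len(s)]
            break
    return word


# The four word forms from the text all reduce to the same base.
stems = [light_stem(w) for w in ["أحبت", "يحب", "تحب", "حب"]]
```

The minimum-length guard prevents short words from being stripped down to nothing, a typical weakness of naive affix removal.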

Semantic analysis

WordNet [22] is a lexical database of terms that is available online. In the English WordNet, terms are divided into groups called synonym sets (synsets). A synset is a set of terms linked to other terms by more than one relationship (e.g., is-part-of, has-a, is-a, and others). SentiWordNet is used for sentiment classification based on the orientation of every term: each term is assigned to one of three groups that identify its sentiment orientation (negative, positive, or neutral). SentiWordNet is downloaded as a single file. Each term in this file has seven fields, including the offset, positive score, negative score, part of speech, objective score, and associated terms. The offset is an integer that identifies the term's synonyms in the data set, the part of speech indicates whether the term is, for example, an adjective or a noun, and the associated terms are the group of words that may be used in its place. We use WordNet as a synonym set that makes it easier to find the sentiment orientation of terms not found in the lexicon. The process starts by analyzing the document and assigning terms to their classes; for any word not found in the lexicon, the system automatically searches WordNet for its synonyms, as shown in Table 1.

After finding a synonym, we look up its class, which helps determine the sentiment orientation of the document, as shown in Figure 3.

Table 1

Terms, Synonyms and Translation

Arabic term | English translation | Arabic synonym | English synonym | Synonym polarity | Expected polarity
ذكي | Clever | مجتهد | Assiduous | Pos | Pos
 | | مثابر | Persistent | Pos |
 | | عبقري | Genius | Pos |

Fig 3 process of finding word in word net [23]
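The lookup with synonym fallback can be sketched in Python as below. The two dictionaries are hypothetical toy data standing in for the sentiment lexicon and the WordNet synonym sets; the function name and the rule of returning the first synonym found in the lexicon are assumptions.

```python
# Hypothetical toy data: a sentiment lexicon and a WordNet-like synonym map.
LEXICON = {"مجتهد": "pos", "عبقري": "pos", "كسول": "neg"}
SYNONYMS = {"ذكي": ["مجتهد", "مثابر", "عبقري"]}


def polarity(term):
    """Return the term's polarity, falling back to its synonyms if needed."""
    if term in LEXICON:
        return LEXICON[term]
    # Term is missing from the lexicon: try each synonym in turn.
    for synonym in SYNONYMS.get(term, []):
        if synonym in LEXICON:
            return LEXICON[synonym]
    return None  # orientation unknown
```

A term with neither a lexicon entry nor a known synonym yields `None`, so the caller can decide whether to treat it as neutral or skip it.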

Training phase

Machine learning is a branch of artificial intelligence used to build systems that learn to make predictions from data. For example, machine learning is used to classify emails as spam or non-spam after learning to distinguish them. Training data is collected and passed to the learning algorithm, which extracts the rules for classification. Once the model has been trained, it can be used to classify new data, as shown in Figure 4.

Fig 4 Classification based machine learning process [24]

Naive Bayes

Naive Bayes (NB) is a machine learning algorithm for text classification based on Bayes' theorem. NB is one of the most efficient algorithms because it requires little memory and CPU time and needs little training data to classify with high accuracy. It works on word occurrences in the text, calculating the term frequency of every word, which is then used for classification [25]. Using Bayes' theorem, the conditional probability can be decomposed as

$$p(C_k \mid x) = \frac{p(C_k)\, p(x \mid C_k)}{p(x)} \tag{1}$$

where $p(C_k \mid x)$ is the probability that instance $x$ belongs to class $C_k$, $p(x \mid C_k)$ is the probability of generating instance $x$ given class $C_k$, $p(C_k)$ is the prior probability of class $C_k$, and $p(x)$ is the probability of instance $x$ occurring.
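To make the use of Bayes' theorem concrete, the following Python sketch implements a small multinomial Naive Bayes classifier over word counts. It is an illustration under stated assumptions rather than the authors' implementation: Laplace (add-one) smoothing and the toy English training data are not from the paper, and p(x) is omitted because it is the same for every class.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (tokens, label). Returns class counts, word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log p(C_k) + sum of log p(word | C_k)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior p(C_k)
        total_words = sum(word_counts[label].values())
        for t in tokens:
            # Laplace smoothing (an assumption) avoids log(0) for unseen words.
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
model = train_nb([
    (["good", "lecture"], "pos"),
    (["useful", "lab"], "pos"),
    (["boring", "lecture"], "neg"),
    (["bad", "lab"], "neg"),
])
```

Working in log space turns the product of word probabilities into a sum, which avoids numerical underflow on longer documents.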

Support Vector Machine (SVM)

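The idea behind the support vector machine (SVM) is to find the maximum-margin hyperplane that separates the training data according to their classes. SVM plays a major role in sentiment analysis and text classification. Consider $m$ training points $x_1, x_2, \ldots, x_m$, each paired with a label (or target) $y_i$ that, in the two-class classification considered here, is either +1 or -1, with -1 representing one class and +1 the other, as in Figure 5. The training data belonging to the two classes is

$$D = \{(x_1, y_1), \ldots, (x_m, y_m)\}, \quad x_i \in \mathbb{R}^n,\ y_i \in \{-1, +1\} \tag{2}$$

with a hyperplane

$$\langle w, x \rangle + b = 0 \tag{3}$$

The separating hyperplane must satisfy the constraints

$$y_i \left( \langle w, x_i \rangle + b \right) \ge 1, \quad i = 1, \ldots, m \tag{4}$$

The distance $d(w, b; x)$ of a point $x$ from the hyperplane $(w, b)$ is

$$d(w, b; x) = \frac{|\langle w, x \rangle + b|}{\lVert w \rVert} \tag{5}$$

The optimal hyperplane is given by maximizing the margin $\rho$, subject to the constraints of Equation (4). The margin is given by

$$\rho(w, b) = \min_{x_i : y_i = -1} d(w, b; x_i) + \min_{x_i : y_i = +1} d(w, b; x_i) = \frac{2}{\lVert w \rVert} \tag{6}$$

Hence the hyperplane that optimally separates the data is the one that minimizes

$$\Phi(w) = \frac{1}{2} \lVert w \rVert^2 \tag{7}$$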

Fig 5 SVM classification approach [26]
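The geometric quantities behind the SVM can be checked numerically with a short Python sketch. It uses the standard formulas for a canonical separating hyperplane: the distance of a point x from the hyperplane (w, b) is |w·x + b| / ||w||, every training point must satisfy y(w·x + b) ≥ 1, and the margin is 2 / ||w||. The data points and the hyperplane below are hypothetical, and no training is performed; the sketch only evaluates a given hyperplane.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(w):
    return math.sqrt(dot(w, w))


def distance(w, b, x):
    """Distance of point x from the hyperplane w.x + b = 0."""
    return abs(dot(w, x) + b) / norm(w)


def satisfies_constraints(w, b, data):
    """True if every (x, y) pair satisfies y * (w.x + b) >= 1."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in data)


def margin(w):
    """Margin of a canonical hyperplane."""
    return 2 / norm(w)


# Hypothetical two-class data and a candidate hyperplane.
data = [((2, 2), +1), ((3, 3), +1), ((0, 0), -1), ((-1, 0), -1)]
w, b = (0.5, 0.5), -1.0
```

For this hyperplane the closest points of the two classes are (2, 2) and (0, 0); the sum of their distances equals the margin, about 2.83.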

Hidden Markov Model (HMM)

The hidden Markov model (HMM) is an algorithm based on probability distributions. It takes its name from two features: the states that generate the observations are hidden, and those states follow the Markov property. A hidden Markov model can be considered a generalization of a mixture model in which the hidden (or latent) variables, which control the mixture component selected for each observation, are related through a Markov process rather than being independent of each other. An HMM is a generative, probabilistic model that tries to model the process generating the training sequences or, more precisely, the distribution over the sequences of observations. For clarity, consider the case of text tagging. The goal is to learn to generate text: given one or more words, the system should be able to generate text on its own. To do that, the model attempts to learn the grammatical structure of sentences (e.g., a verb comes after a noun) and how likely given words are to follow one another, given their lexical category. The corresponding graphical model for an HMM is shown in Figure 6.

Fig 6 graphical model for an HMM

In the graph, y(t) is the observation at time (or position) t; in our case, the word y at position t. The hidden state x(t) corresponds to its lexical category. The model assumes that the observations (for example, words) are generated by some hidden state. In addition, it assumes that the hidden states follow the Markov property: the current state depends only on the previous state. That is, whether the next word is a verb or an adjective depends only on the current lexical category. The model generates text as follows: given the current word and category, it first samples the next category from the distribution P(x(t) | x(t−1)); given the resulting x(t), it then samples a word from the distribution P(y(t) | x(t)). An HMM also allows different structures to be modeled directly, but it must be trained on a set of seed sequences and requires a larger seed than a simple Markov model [27].
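The two distributions P(x(t) | x(t−1)) and P(y(t) | x(t)) can be stored as simple tables. The Python sketch below uses them with Viterbi decoding, a standard HMM algorithm that the paper does not describe, to recover the most likely sequence of hidden lexical categories for a sentence. All probabilities, states, and words are hypothetical toy values.

```python
def viterbi(obs, states, start, trans, emit):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start[s] * emit[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        row = {}
        for s in states:
            prev, p = max(
                ((r, best[t - 1][r][0] * trans[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            row[s] = (p * emit[s][obs[t]], prev)
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = best[t][state][1]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: two lexical categories and three words.
states = ["NOUN", "VERB"]
start = {"NOUN": 0.7, "VERB": 0.3}
trans = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit = {
    "NOUN": {"students": 0.5, "courses": 0.4, "like": 0.1},
    "VERB": {"students": 0.05, "courses": 0.05, "like": 0.9},
}
tags = viterbi(["students", "like", "courses"], states, start, trans, emit)
```

With these tables the decoded categories for "students like courses" are NOUN, VERB, NOUN, which matches the intuition that a verb tends to follow a noun.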

Comparison of machine learning Algorithms

Table 2 presents a detailed comparison of the machine learning algorithms, showing their strengths and weaknesses with respect to data size, learning speed, and accuracy.

Table 2

Comparison of machine learning algorithms

Algorithm | Advantages | Disadvantages
SVM | 1. High accuracy. 2. Good for text classification problems. 3. Can work well even when the data is not linearly separable. | 1. Hard to interpret. 2. Cumbersome to run and tune.
NB | 1. Very simple. 2. Needs little training data. 3. Fast and easy to run. | Hard to learn interactions among features.
HMM | 1. Easy to interpret and explain. 2. Handles feature interactions easily. 3. Scalable and fast. 4. Non-parametric. | Does not support online learning, so the model must be rebuilt for every new example.

Sentiment classification phase

As stated before, the proposed system is based on a hybrid technique for multi-aspect sentiment analysis. After the preprocessing phase, the data is ready for classification, which is performed using different machine learning algorithms (NB, SVM, HMM). Each aspect has its own strength that affects the final decision of positive or negative. For example, consider price, cleanliness, and taste of food as aspects (features) of a restaurant. These aspects cannot all be treated equally, because one user may choose a restaurant with a high standard of cleanliness regardless of price, while price may be the first concern for other customers. The proposed system therefore lets users assign an impact factor (weight) to each aspect, which is used to decide whether the overall sentiment is positive or negative.

where F(x_i) is the result of the machine learning algorithm for aspect k, w_k is the weight of aspect k, and d(i) is the binary classification value of the algorithm (1 if positive and 0 if negative). For each aspect in the test tweet, the results of the multiple machine learning algorithms are fused by a voting technique that combines the results of the three algorithms.
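One possible reading of this fusion step is sketched below in Python. It is an assumption-laden illustration, not the authors' formula: the three classifiers' binary outputs for each aspect are combined by majority vote, the per-aspect decisions are combined as a weighted average using the user-supplied weights, and the 0.5 decision threshold is hypothetical. The aspect names and weights are example values.

```python
def majority_vote(votes):
    """votes: list of 0/1 predictions from the classifiers for one aspect."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def overall_sentiment(aspect_votes, weights, threshold=0.5):
    """Weighted average of the per-aspect decisions (threshold is an assumption)."""
    total = sum(weights[a] for a in aspect_votes)
    score = sum(weights[a] * majority_vote(v) for a, v in aspect_votes.items()) / total
    return ("positive" if score >= threshold else "negative"), score


# Hypothetical outputs of NB, SVM, and HMM for three restaurant aspects.
aspect_votes = {"price": [0, 0, 1], "cleanliness": [1, 1, 1], "taste": [1, 0, 1]}

# A user who cares most about cleanliness.
label_a, score_a = overall_sentiment(
    aspect_votes, {"price": 0.2, "cleanliness": 0.5, "taste": 0.3})

# A user who cares most about price.
label_b, score_b = overall_sentiment(
    aspect_votes, {"price": 0.7, "cleanliness": 0.2, "taste": 0.1})
```

The same classifier outputs give a positive result for the first user and a negative one for the second, which is the effect of the per-aspect weights described in the text.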

Experimental setup and result

Evaluation Metrics

Accuracy is not the only metric for evaluating the efficiency of a classifier. Two other measures, precision and recall, give more information about the classifier's performance.


Precision

Precision measures the correctness of the classifier: as precision increases, the number of false positives decreases (a false positive is a document classified as positive when it is actually neutral or negative).

Precision = TP / (TP + FP) (9)

where TP is the number of true positives and FP is the number of false positives.


Recall

Recall is also known as sensitivity or completeness: as recall increases, the number of false negatives decreases (a false negative is a document classified as negative when it is actually neutral or positive). Recall is complementary to precision, and improving recall can often decrease precision, because it gets increasingly harder to be precise as the sample space increases.

Recall = TP / (TP + FN) (10)

where FN is the number of false negatives.

F-measure Metric

F-measure combines precision and recall into a single score, their harmonic mean, which makes it a convenient summary when comparing classifiers [28].

F-measure = 2 * (Precision * Recall) / (Precision + Recall) (11)
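The three metrics follow directly from the confusion counts, as the short Python sketch below shows. The function names and the counts in the example are hypothetical.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn)


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts: 46 true positives, 4 false positives, 12 false negatives.
p = precision(46, 4)   # 0.92
r = recall(46, 12)     # about 0.79
f = f_measure(p, r)    # about 0.85
```

As a sanity check, a precision of 0.92 and a recall of 0.79 give an F-measure of about 0.85.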

Data set

We used educational data collected from three courses in three different years. It expresses students' opinions of both lectures and labs and identifies the advantages and drawbacks of these courses. The educational data was collected to help improve the educational process: automatically analyzing student opinions makes it possible to fix drawbacks and get the most benefit from these courses. We collected the data set manually in the period from 1-11-2015 to 15-1-2016, gathering about 500 comments (both positive and negative) written in both Modern Standard Arabic (MSA) and the Egyptian dialect. We used a lexicon to label the comments into classes (positive and negative) and then applied the preprocessing steps discussed in Section 3.3.

Results for Arabic opinions using Machine Learning

We propose a new hybrid algorithm that combines (SVM and NB), (NB and HMM), and (HMM and SVM) by combining the rules used to build the training model. A further advantage of our method is that it achieves effective classification with a small number of principal components. The hybrid algorithms first train first-level individual learners, then generate a new data set from the extracted rules, and finally test the model on a new data set.

Table 3 contains the classification results for statements, obtained by combining the algorithms (NB, SVM, and HMM) in pairs in order to improve the accuracy of the results.

Table 3

Results of machine learning for statements

For statements | NB & SVM | NB & HMM | SVM & HMM
Precision | .78 | .82 | .81
Recall | .68 | .83 | .82
F-measure | .72 | .82 | .81

Table 4 contains the classification results for aspects, obtained by combining the algorithms (NB, SVM, and HMM) in pairs in order to improve the accuracy of the results.

Table 4

Results of machine learning for aspects

For aspects | NB & SVM | NB & HMM | SVM & HMM
Precision | .86 | .82 | .88
Recall | .76 | .83 | .76
F-measure | .80 | .82 | .82

Table 5 contains the results obtained with aspect weights; using the weights led to a significant rise in the results.

Table 5

Results for aspects with weight

For aspects | NB with weight | SVM with weight | HMM with weight
Precision | .86 | .91 | .85
Recall | .87 | .92 | .92
F-measure | .86 | .91 | .89

Table 6 shows the average results when opinions are classified with semantic meaning and when semantics are neglected in the classification process.

Table 6

Total semantic orientation of data set

 | Non-semantic | Semantic
Precision | .66 | .92
Recall | .60 | .79
F-measure | .63 | .85

Table 7

Comparison with other work

Algorithm | S. Zafra [10] | Bin [13] | M. Ali [19] | Our work
SVM | 76.7 | 82.9 | 0.877 | .95
NB | 88.9 | 82.7 | 0.766 | .92
HMM | 78.8 | 81.4 | | .93

Table 7 compares the results of our work with the results reported for the algorithms in other works.

Figure 7 illustrates the disparity between the results we obtained by applying the algorithms to sentiment analysis and the results of other works. From the literature, we find that accuracies obtained with a single machine learning algorithm are mostly low. Using hybrid machine learning algorithms with aspects increases accuracy by 7-9%, as in Tables 3 and 4, and adding semantics increases accuracy further, as in Table 6. We compared the accuracies of the techniques and compared our results with recent research; the tests show that using hybrid classification techniques increases classification accuracy.

Fig 7 Comparison with other work


Conclusion

A new approach to sentiment analysis for improving the relationship between students and the educational process has been presented. An educational data set was collected and preprocessed. After the preprocessing steps, a training model containing all the reviews and their sentiment was built, and more than one machine learning algorithm was applied.