
With the advancement of technology, cybercrimes are surging at an alarming rate as miscreants exploit the world's growing reliance on virtual platforms. Because an enormous quantity of cybercrime data has accumulated, there is considerable potential to analyze and segregate it with the help of Machine Learning. The focus of this research is to construct a model, Ensem_SLDR, that predicts the relevant sections of the IT Act 2000 from complaint texts/subjects with the aid of Natural Language Processing, Machine Learning, and Ensemble Learning methods. The objective of this paper is to implement a robust technique that categorizes cybercrime under two sections, 66 and 67, of the IT Act 2000 with high precision using ensemble learning. In the proposed methodology, a Bag of Words approach is applied for feature engineering, and the resulting features are given as input to the hybrid model Ensem_SLDR. Experiments on the dataset show that extremely randomized trees with word-embedding vectors as input achieved an F-score of 85.66%. Also, the imbalanced dataset problem is handled by oversampling.
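
The two preprocessing ideas mentioned above, Bag of Words features and oversampling of the minority class, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the pipeline used for Ensem_SLDR: the complaint strings, the label assignment, and the helper names `build_vocabulary`, `bag_of_words`, and `random_oversample` are all hypothetical, and whitespace tokenization is assumed for simplicity.

```python
from collections import Counter
import random


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return a count vector for one document; unknown tokens are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for x, y in zip(features, labels):
        by_label.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_label.values())
    out_x, out_y = [], []
    for y, rows in by_label.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical complaint subjects; the section labels are assigned for illustration only.
complaints = [
    "account hacked and data stolen",
    "password stolen from account",
    "system hacked by unknown person",
    "obscene content posted online",
]
labels = ["66", "66", "66", "67"]

vocab = build_vocabulary(complaints)
X = [bag_of_words(doc, vocab) for doc in complaints]
X_bal, y_bal = random_oversample(X, labels)
```

After oversampling, both classes in this toy set contain three rows, so `X_bal` could be passed to any classifier that accepts count vectors. Random duplication is the simplest form of oversampling; the text does not say which variant was used.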


Sharing content easily on social media has become an important means of communication in today's world. However, alongside the conveniences it provides, problems have emerged because content sharing is not bounded by predefined rules. Consequently, offensive language has become a major problem for both social media platforms and their users. This article aims to detect offensive language in short text messages on Twitter. Short texts have drawbacks because they do not include enough statistical information. To cope with these drawbacks, semantic word expansion based on concepts and word-embedding vectors are proposed. Then, for the classification task, a decision tree and decision-tree-based ensemble classifiers are used: Adaptive Boosting, Bootstrap Aggregating, Random Forest, Extremely Randomized Decision Tree, and Extreme Gradient Boosting.

Social media limits the size of user messages. Users generally express their opinions with emoticons, abbreviations, slang, and symbols instead of words. This makes the sentiment classification of social media texts more complex. This paper proposes a sentiment classification model for Twitter messages to overcome this difficulty. In the proposed model, the short messages are first expanded with BabelNet, which is a concept network. Then the expanded and the original forms of the messages are included in an ensemble learning model. Finally, we compared our ensemble model with traditional classification algorithms and observed an increase in the F-measure.
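
The concept-based expansion of short messages described above can be illustrated with a small sketch. It assumes a hand-written dictionary, `CONCEPTS`, as a stand-in for a concept network such as BabelNet; the entries, the `expand` helper, and the sample message are hypothetical, and no real BabelNet call is made.

```python
# Toy concept map standing in for a concept network such as BabelNet.
# The entries are hypothetical; a real system would query the network instead.
CONCEPTS = {
    "gr8": ["great", "good"],
    "luv": ["love", "affection"],
    ":(": ["sad", "unhappy"],
}


def expand(message, concepts=CONCEPTS):
    """Append related concept words after each token that has an entry."""
    expanded = []
    for tok in message.lower().split():
        expanded.append(tok)
        expanded.extend(concepts.get(tok, []))
    return " ".join(expanded)


# Hypothetical short message with abbreviations in place of words.
message = "luv this phone gr8 camera"

# The original and the expanded form are kept as two separate views,
# so that each can be given to its own base learner in an ensemble.
views = [message, expand(message)]
```

The expanded view adds ordinary words next to the abbreviations, which gives a word-based classifier more statistical evidence than the original message alone. How the two views are combined inside the ensemble is not specified in the text, so that step is left out here.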

Today, people chat with their friends, maintain social relationships, shop, and follow current events through social media. With the widespread use of social media in daily life, user reviews have emerged as an influential factor in numerous fields, including understanding consumer attitudes, determining political tendencies, and revealing the strengths and weaknesses of organizations.
