Browse by Author "Inan, Ali"
Showing 1 - 18 of 18

Item: Deep Learning-based Sentiment Analysis of Facebook Data: The Case of Turkish Users (Oxford Univ Press, 2021) Coban, Onder; Ozel, Selma Ayse; Inan, Ali
Sentiment analysis (SA) is an essential task for many domains where it is crucial to know users' public opinion about events, products, brands, politicians and so on. Existing works on SA have concentrated on English texts including Twitter feeds and user reviews on hotels, movies and products. On the other hand, Facebook, as an online social network (OSN), has attracted quite limited attention from the research community. Among these, SA studies on Turkish text obtained from OSNs are extremely scarce. In this paper, our aim is to perform SA on public Facebook data collected from Turkish user accounts. Our study differs from existing studies in terms of the data set scale, the natural language of the texts in the data set and the extent of experimental analyses that include both machine learning and deep learning techniques. We extensively report not only the results of different learning models involving SA but also statistical distribution of metadata of user activities across various user attributes (e.g. gender and age). Our experimental results indicate that recurrent neural networks achieve the best accuracy (i.e. 0.916) with word embeddings. To the best of our knowledge, this is the best result for SA on Facebook data in the context of the Turkish language.

Item: Detection and Cross-domain Evaluation of Cyberbullying in Facebook Activity Contents for Turkish (Assoc Computing Machinery, 2023) Coban, Onder; Ozel, Selma Ayse; Inan, Ali
Cyberbullying refers to bullying and harassment of defenseless or vulnerable people such as children, teenagers, and women through any means of communication (e.g., e-mail, text messages, wall posts, tweets) over any online medium (e.g., social media, blogs, online games, virtual reality environments).
The effect of cyberbullying may be severe and irreversible, and it has become one of the major problems of cyber-societies in today's electronic world. Prevention of cyberbullying activities as well as the development of timely response mechanisms require automated and accurate detection of cyberbullying acts. This study focuses on the problem of cyberbullying detection over Facebook activity content written in Turkish. Through extensive experiments with various machine learning and deep learning algorithms, the best estimator for the task is chosen and then employed for both cross-domain evaluation and profiling of cyber-aggressive users. The results obtained with fivefold cross-validation are evaluated with a macro-averaged F1 score. These results show that BERT is the best estimator with a macro-averaged F1 of 0.928, and employing it on various datasets collected from different OSN domains produces highly satisfying results. This article also reports detailed profiling of cyber-aggressive users by providing even more information than what is visible to the naked eye.

Item: Differentially private attribute selection for classification (Gazi Univ, Fac Engineering Architecture, 2018) Var, Esra; Inan, Ali
Selecting a relevant subset of attributes is one of the most important data preprocessing steps of data mining and machine learning solutions. For the classification task, selection is based on the correlation between an attribute and the class attribute. There are various studies on privacy-preserving classification. However, there is no attribute selection solution for such work in the literature. In this study, novel attribute selection methods based on the state-of-the-art solution in statistical database security, known as differential privacy, are proposed.
The proposed solutions are implemented with the popular data mining library WEKA, and experimental results confirm the positive effects of the proposed solutions on classification accuracy.

Item: Differentially private nearest neighbor classification (Springer, 2017) Gursoy, Mehmet Emre; Inan, Ali; Nergiz, Mehmet Ercan; Saygin, Yucel
Instance-based learning, and the k-nearest neighbors algorithm (k-NN) in particular, provide simple yet effective classification algorithms for data mining. Classifiers are often executed on sensitive information such as medical or personal data. Differential privacy has recently emerged as the accepted standard for privacy protection in sensitive data. However, straightforward applications of differential privacy to k-NN classification yield rather inaccurate results. Motivated by this, we develop algorithms to increase the accuracy of private instance-based classification. We first describe the radius neighbors classifier (r-N) and show that its accuracy under differential privacy can be greatly improved by a non-trivial sensitivity analysis. Then, for k-NN classification, we build algorithms that convert k-NN classifiers to r-N classifiers. We experimentally evaluate the accuracy of both classifiers using various datasets. Experiments show that our proposed classifiers significantly outperform both baseline private classifiers (i.e., straightforward applications of differential privacy) and classifiers executed on a dataset published using differential privacy.
In addition, the accuracy of our proposed k-NN classifiers is at least comparable to, and in many cases better than, the other differentially private machine learning techniques.

Item: Energy Management in Organized Industrial Zones: Promoting the Green Energy Transition in Turkish Manufacturing Industry (IEEE, 2024) Ediger, Volkan S.; Kucuker, Mehmet Ali; Berk, Istemi; Inan, Ali; Uctug, Fehmi Gorkem
Organized Industrial Zones (OIZs), which gained legal status by Law 4562 of 2000, have played a significant role in Turkish industrialization policies, particularly in improving Small and Medium-sized Enterprises (SMEs). Energy management (EM) within OIZs is essential for Turkiye's green transition and 2053 net-zero pathway. Following the publication of a directive on OIZs' electricity market activities in 2006, enterprises can purchase electricity directly from OIZ management. Moreover, the Energy Efficiency Law No. 5627 of 2007 required OIZs to establish an energy management unit (EMU) to serve the participants with less than 1000 tons of oil equivalent (toe) energy consumption. EMUs provide OIZ management with a unique opportunity to enhance sustainable energy transition by increasing renewable energy production and improving the energy efficiency of participating enterprises. The primary goal of this research is to evaluate the effectiveness of energy management units in OIZs in encouraging energy efficiency and green energy transition in the Turkish manufacturing industry. As a case study, we examine EM in the Adana Haci Sabanci Organized Industrial Zone (Adana OIZ), which ranks third among OIZs regarding electricity consumption. We analyze data on electricity infrastructures, roof-top PVs, invoice settlements/offsets, energy efficiency investments, and GHG emissions between 2017 and 2023.
Our preliminary findings suggest that the EMU in the Adana OIZ makes a very important contribution to the green transition of industrial establishments and that regulatory changes over the last decades have had positive effects. The share of renewable energy in the total energy mix increased from 1.6% to 21.4% over six years, and there has been a noteworthy enhancement in energy efficiency, reaching 27% in 22 companies evaluated. The main policy implication of our findings is that the role of regulatory bodies and efficient energy management in OIZs will be critical in achieving Turkiye's net zero target of 2053.

Item: Explainable Profiling Attacks on Ethereum Blockchain Users Based on Volumetric and Temporal Behaviour (IEEE, 2024) Kılıç, Yasir; Inan, Ali
One of many different application areas of blockchain technology is cryptocurrencies. Products like Bitcoin and Solana provide financial services that are unmediated, distributed and anonymous. Among various blockchains, Ethereum stands out due to its support of smart contracts. However, softly authenticated transactions occurring on such platforms facilitate crimes like money laundering and sales of illegal items/services. Deanonymization, over blockchains, refers to identifying distinct accounts of the same person and is used for tracking illegal trafficking of cryptocurrencies. In this study, our purpose is to increase the rate of success of deanonymization and to support explainable approaches. Towards this aim, we imitate blockchain analysts and propose 19 novel heuristic features that are volumetric and temporal. Empirical experiments indicate that temporal features increase the attack success rate by 39%.
Shapley values adapted from the cooperative game theory field support this finding.

Item: Explode: An Extensible Platform for Differentially Private Data Analysis (IEEE, 2016) Esmerdag, Emir; Gursoy, Mehmet Emre; Inan, Ali; Saygin, Yucel
Differential privacy (DP) has emerged as a popular standard for privacy protection and received great attention from the research community. However, practitioners often find DP cumbersome to implement, since it requires additional protocols (e.g., for randomized response, noise addition) and changes to existing database systems. To avoid these issues, we introduce Explode, a platform for differentially private data analysis. The power of Explode comes from its ease of deployment and use: the data owner can install Explode on top of an SQL server, without modifying any existing components. Explode then hosts a web application that allows users to conveniently perform many popular data analysis tasks through a graphical user interface, e.g., issuing statistical queries, classification, correlation analysis. Explode automatically converts these tasks to collections of SQL queries, and uses the techniques in [3] to determine the right amount of noise that should be added to satisfy DP while producing high utility outputs. This paper describes the current implementation of Explode, together with potential improvements and extensions.

Item: Facebook Tells Me Your Gender: An Exploratory Study of Gender Prediction for Turkish Facebook Users (Assoc Computing Machinery, 2021) Coban, Onder; Inan, Ali; Ozel, Selma Ayse
Online Social Networks (OSNs) are very popular platforms for social interaction. Data posted publicly over OSNs pose various threats against the individual privacy of OSN users. Adversaries can try to predict private attribute values, such as gender, as well as links/connections. Quantifying an adversary's capacity in inferring the gender of an OSN user is an important first step towards privacy protection.
Numerous studies have been conducted on the problem of predicting the gender of an author/user, especially in the context of the English language. Conversely, studies in this field are quite limited for the Turkish language and specifically in the domain of OSNs. Previous studies for gender prediction of Turkish OSN users have mostly been performed by using the content of tweets and Facebook comments. In this article, we propose using various features, not just user comments, for the gender prediction problem over the Facebook OSN. Unlike existing studies, we exploited features extracted from profile, wall content, and network structure, as well as wall interactions of the user. Therefore, our study differs from the existing work in the broadness of the features considered, machine learning and deep learning methods applied, and the size of the OSN dataset used in the experimental evaluation. Our results indicate that basic profile information provides better results; moreover, using this information together with wall interactions improves prediction quality. We measured the best accuracy value as 0.982, which was obtained by combining profile data and wall interactions of Turkish OSN users. In the wall interactions model, we introduced 34 different features that provide better results than the existing content-based studies for Turkish.

Item: Fine-grained Kinship Detection for Facebook Users based on Wall Contents (Institute of Electrical and Electronics Engineers Inc., 2021) Coban, Onder; Inan, Ali; Ozel, Selma Ayse
This paper investigates whether it is possible to automatically detect fine-grained kinship (not to detect its existence but to detect the type of kinship like child, father, grandfather, and so on) between two Facebook users. To do so, we present and employ a lexicon-based approach that completely depends on the wall contents of users.
To the best of our knowledge, this is the first study towards kinship detection for both the type of input data (i.e., free OSN text) and the language (i.e., Turkish). We perform our experiments on a crawled snapshot of public Facebook data collected from accounts of users in Turkey. Our results are promising and show that a content-based approach can be a good starting point for future work even though it has some challenges. © 2021 IEEE.

Item: Gaussian Mixture Models for Classification and Hypothesis Tests Under Differential Privacy (Springer International Publishing Ag, 2017) Tong, Xiaosu; Xi, Bowei; Kantarcioglu, Murat; Inan, Ali
Many statistical models are constructed using very basic statistics: mean vectors, variances, and covariances. Gaussian mixture models are such models. When a data set contains sensitive information and cannot be directly released to users, such models can be easily constructed based on noise-added query responses. The models nonetheless provide preliminary results to users. Although the queried basic statistics meet the differential privacy guarantee, the complex models constructed using these statistics may not meet the differential privacy guarantee. However, it is up to the users to decide how to query a database and how to further utilize the queried results. In this article, our goal is to understand the impact of the differential privacy mechanism on Gaussian mixture models. Our approach involves querying basic statistics from a database under differential privacy protection, and using the noise-added responses to build classifiers and perform hypothesis tests. We discover that adding Laplace noise may have a non-negligible effect on model outputs. For example, the variance-covariance matrix after noise addition is no longer guaranteed to be positive definite. We propose a heuristic algorithm to repair the noise-added variance-covariance matrix.
We then examine the classification error using the noise-added responses, through experiments with both simulated data and real-life data, and demonstrate under which conditions the impact of the added noise can be reduced. We compute the exact type I and type II errors under differential privacy for the one-sample z-test, the one-sample t-test, and the two-sample t-test with equal variances. We then show under which conditions a hypothesis test returns a reliable result given differentially private means, variances and covariances.

Item: Graph-based modelling of query sets for differential privacy (Assoc Computing Machinery, 2016) Inan, Ali; Gursoy, Mehmet Emre; Esmerdag, Emir; Saygin, Yucel
Differential privacy has gained attention from the community as the mechanism for privacy protection. Significant effort has focused on its application to data analysis, where statistical queries are submitted in batch and answers to these queries are perturbed with noise. The magnitude of this noise depends on the privacy parameter ε and the sensitivity of the query set. However, computing the sensitivity is known to be NP-hard. In this study, we propose a method that approximates the sensitivity of a query set. Our solution builds a query-region-intersection graph. We prove that computing the maximum clique size of this graph is equivalent to bounding the sensitivity from above. Our bounds, to the best of our knowledge, are the tightest known in the literature. Our solution currently supports a limited but expressive subset of SQL queries (i.e., range queries), and almost all popular aggregate functions directly (except AVERAGE).
Experimental results show the efficiency of our approach: even for large query sets (e.g., more than 2K queries over 5 attributes), by utilizing a state-of-the-art solution for the maximum clique problem, we can approximate sensitivity in under a minute.

Item: Industrial Fault Detection and Classification with the Optimal Windows Size Approach (IEEE, 2024) Ayana, Omer; Inan, Ali
Various important issues in industrial production processes such as product quality, process safety and supply continuity are directly related to machine faults that occur in production and distribution stages. In addition to economic losses, machine faults also result in industrial accidents. Early diagnosis of possible faults would cut down possible losses. To date, various solutions for fault detection have been proposed. Existing solutions either detect faults after they occur or misdiagnose them due to complexity caused by operating over multiple measurements. In this study, to the best of our knowledge, we propose a supervised model that optimally determines the window size for both fault detection and classification problems. Additionally, in order to determine the features that are more closely related to the problem, we apply the binary version (BCS) of the nature-inspired Cuckoo Search Algorithm (CSA) for feature selection. Our results indicate that determining the window size appropriately has a significant impact on accuracy and that feature selection increases the F-score by roughly 13%.

Item: Inverse document frequency-based sensitivity scoring for privacy analysis (Springer London Ltd, 2022) Coban, Onder; Inan, Ali; Ozel, Selma Ayse
Privacy risk analysis of online social network (OSN) users aims at generating a risk score for each OSN user such that higher scores potentially imply a greater risk of privacy violation.
Privacy risk analysis is typically carried out over a response matrix (R) where any matrix element r_ij indicates the portion of the OSN with which user i shares his/her attribute j. Most of the existing work relies on the mathematical framework of item response theory to derive sensitivity and visibility components from R. In this study, we propose interpreting R to be a term-document matrix and consequently suggest using the inverse document frequency (IDF) method as the sensitivity component. Experiments performed on both synthetic and real-world datasets show that the proposed IDF-based method can be used as a sensitivity component.

Item: Named Entity Recognition over FBNER: A New Facebook Dataset in Turkish (Institute of Electrical and Electronics Engineers Inc., 2021) Coban, Onder; Ozel, Selma Ayse; Inan, Ali
In this paper, we introduce a new Named Entity Recognition (NER) dataset of Facebook messages written in the Turkish language. We also employ a Conditional Random Fields based NER system to discover named entities from Facebook messages. Our system achieves an F1 score of 0.713 when training and test sets include Facebook posts. We also obtained an F1 score of 0.599 when the training set is from the news domain. A strength of this research is that it is one of the first studies in this field that focuses on NER over Turkish Facebook messages. This is because performing NER on user-generated content is a very challenging task, since such informal content is often noisy text with grammatical and spelling errors. © 2021 IEEE.

Item: Privacy Risk Analysis for Facebook Users (IEEE, 2020) Coban, Onder; Inan, Ali; Ozel, Selma Ayse
Disclosing personal information over Online Social Networks (OSNs) poses security and privacy risks. The privacy risk imposed on an OSN account can be measured through the privacy preferences and the network position of the corresponding account.
In this study, two state-of-the-art methods are used to measure the privacy risk of Turkish users of the Facebook OSN. Experimental results show that male users and users in the 21-40 age range are at greater risk.

Item: Privacy Scoring over OSNs: Shared Data Granularity as a Latent Dimension (Assoc Computing Machinery, 2023) Kılıç, Yasir; Inan, Ali
Privacy scoring aims at measuring the privacy violation risk of a user over an online social network (OSN) based on attribute values shared in the user's OSN profile page and the user's position in the network. Existing studies on privacy scoring rely on possibly biased or emotional survey data. In this study, we work with real-world data collected from the professional LinkedIn OSN and show that probabilistic scoring models derived from the item response theory fit real-world data better than naive approaches. We also introduce the granularity of the data an OSN user shares on her profile as a latent dimension of the OSN privacy scoring problem. Incorporating data granularity into our model, we build the most comprehensive solution to the OSN privacy scoring problem. Extensive experimental evaluation of various scoring models indicates the effectiveness of the proposed solution.

Item: Privacy-Preserving Learning Analytics: Challenges and Techniques (IEEE Computer Soc, 2017) Gursoy, Mehmet Emre; Inan, Ali; Nergiz, Mehmet Ercan; Saygin, Yucel
Educational data contains valuable information that can be harvested through learning analytics to provide new insights for a better education system. However, sharing or analysis of this data introduces privacy risks for the data subjects, mostly students. Existing work in the learning analytics literature identifies the need for privacy and poses interesting research directions, but fails to apply state-of-the-art privacy protection methods with quantifiable and mathematically rigorous privacy guarantees.
This work aims to employ and evaluate such methods on learning analytics by approaching the problem from two perspectives: (1) the data is anonymized and then shared with a learning analytics expert, and (2) the learning analytics expert is given a privacy-preserving interface that governs her access to the data. We develop proof-of-concept implementations of privacy-preserving learning analytics tasks using both perspectives and run them on real and synthetic datasets. We also present an experimental study on the trade-off between individuals' privacy and the accuracy of the learning analytics tasks.

Item: Sensitivity Analysis for Non-Interactive Differential Privacy: Bounds and Efficient Algorithms (IEEE Computer Soc, 2020) Inan, Ali; Gursoy, Mehmet Emre; Saygin, Yucel
Differential privacy (DP) has gained significant attention lately as the state of the art in privacy protection. It achieves privacy by adding noise to query answers. We study the problem of privately and accurately answering a set of statistical range queries in batch mode (i.e., under non-interactive DP). The noise magnitude in DP depends directly on the sensitivity of a query set, and calculating sensitivity was proven to be NP-hard. Therefore, efficiently bounding the sensitivity of a given query set is still an open research problem. In this work, we propose upper bounds on sensitivity that are tighter than those in previous work. We also propose a formulation to exactly calculate sensitivity for a set of COUNT queries. However, it is impractical to implement these bounds without sophisticated methods. We therefore introduce methods that build a graph model G based on a query set Q, such that implementing the aforementioned bounds can be achieved by solving two well-known clique problems on G. We make use of the literature in solving these clique problems to realize our bounds efficiently.
Experimental results show that for query sets with a few hundred queries, it takes only a few seconds to obtain results.
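
Illustrative note (not part of the abstract): the link between sensitivity and noise magnitude described above can be sketched for the simplest case. The Python below is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes COUNT queries over closed ranges on a single numeric attribute and neighbouring databases that differ by adding or removing one record. Under those assumptions the L1 sensitivity of the batch is the largest number of ranges that share a common point, which a sweep over range endpoints finds without any clique solver. The names `count_sensitivity`, `laplace_noise`, and `answer_batch` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # closed range [low, high] on one attribute (assumption)


def count_sensitivity(queries: Sequence[Interval]) -> int:
    """Largest number of ranges sharing one point, i.e. how many COUNT
    answers a single added or removed record can change."""
    events = []
    for low, high in queries:
        events.append((low, 0))   # 0 = range opens
        events.append((high, 1))  # 1 = range closes
    # Opens sort before closes at equal values, so ranges that only touch
    # at an endpoint are counted as overlapping (the ranges are closed).
    events.sort()
    depth = best = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def answer_batch(data: Sequence[float], queries: Sequence[Interval],
                 epsilon: float, rng: random.Random) -> List[float]:
    """Answer every COUNT range query with Laplace noise of scale sensitivity / epsilon."""
    if not queries:
        return []
    scale = count_sensitivity(queries) / epsilon
    return [sum(low <= x <= high for x in data) + laplace_noise(scale, rng)
            for low, high in queries]
```

For example, `count_sensitivity([(0, 10), (5, 15), (12, 20)])` evaluates to 2, because no single value lies in all three ranges, so the noise scale is 2/ε rather than the naive 3/ε obtained by counting every query. With several attributes, other aggregates, or other query shapes this shortcut no longer applies directly, which is where the graph model and clique formulations described in the abstract come in.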