Title: Machine Learning-Based Classification of Albanian Wines by Grape Variety, Using Phenolic Compound Dataset
Authors: Topi, Ardiana; Kasaj, Agim; Hudhra, Daniel; Kelebek, Hasim; Guclu, Gamze; Selli, Serkan; Topi, Dritan
Year: 2025
Record dates: 2026-02-27; 2026-02-27
ISSN: 2673-4532
DOI: 10.3390/analytica6040043
URL: http://dx.doi.org/10.3390/analytica6040043
Handle: https://hdl.handle.net/20.500.14669/4633
Volume: 6
Issue: 4
Type: Article
Language: en
Access: info:eu-repo/semantics/openAccess
Keywords: wine phenolics; machine learning; wine authenticity; Albanian grape varieties; LOOCV; PCA; random forest; SVM classification
Web of Science ID: WOS:001645953700001

Abstract: Wine phenolics serve as robust chemical signatures correlated to grape variety, processing, and regional identity. This study explores the potential of machine learning algorithms, combined with the phenolic profiles of Albanian wines, to classify them according to grape variety. Geographic origin analysis was conducted as a preliminary exploration. The dataset of phenolic compounds included white and red wines, spanning the 2017 to 2021 vintages. Using five supervised algorithms (Support Vector Machine [SVM], Random Forest, XGBoost, Logistic Regression, and K-Nearest Neighbors), a high classification accuracy was achieved, with SVM reaching 100% under Leave-One-Out Cross-Validation (LOOCV). To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) and stratified cross-validation were applied. Random Forest feature importance consistently highlighted trans-Fertaric acid and Procyanidin B3 as dominant discriminants. Parallel coordinates plots demonstrated clear varietal patterns driven by phenolic differences, while PCA and hierarchical clustering confirmed unsupervised grouping consistent with wine type and maceration level. Permutation testing (1000 iterations) confirmed the non-randomness of model performance. These findings show that a small set of phenolic markers can offer high classification accuracy, supporting chemically based wine authentication. Although the dataset is relatively small, thorough cross-validation, non-redundant modeling, and chemical interpretability provide a solid foundation for scalable methods. Future work will expand the dataset and explore sensor-based phenolic measurement to enable rapid authentication in wine.
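
The abstract's validation scheme (Leave-One-Out Cross-Validation followed by a label-permutation test) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code. It assumes a 1-nearest-neighbour classifier as a simple stand-in for the K-Nearest Neighbors model named in the abstract, and it uses a hypothetical two-feature toy dataset (`make_toy_data`, with invented labels `variety_A` and `variety_B`) rather than the real phenolic measurements. SMOTE, SVM, and the other models are omitted.

```python
import random


def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the training sample closest to x (1-NN, squared Euclidean distance)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def loocv_accuracy(X, y):
    """Leave-One-Out CV: hold out each sample once, fit on the rest, and score the held-out sample."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def permutation_p_value(X, y, n_iter=1000, seed=0):
    """Compare the observed LOOCV accuracy with accuracies obtained after shuffling the labels.

    Returns (observed_accuracy, p_value). The p-value uses the common
    (count + 1) / (n_iter + 1) correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = loocv_accuracy(X, y)
    shuffled = list(y)
    at_least_as_good = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        if loocv_accuracy(X, shuffled) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_iter + 1)


def make_toy_data(seed=0):
    """Hypothetical data: two 'varieties', two made-up marker concentrations each, 10 samples per class."""
    rng = random.Random(seed)
    centers = {"variety_A": (1.0, 5.0), "variety_B": (5.0, 1.0)}
    X, y = [], []
    for label, (c1, c2) in centers.items():
        for _ in range(10):
            X.append([c1 + rng.gauss(0, 0.3), c2 + rng.gauss(0, 0.3)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    accuracy, p_value = permutation_p_value(X, y, n_iter=200)
    print(f"LOOCV accuracy: {accuracy:.2f}, permutation p-value: {p_value:.4f}")
```

A small permutation p-value indicates that the observed accuracy is unlikely to arise when labels carry no information about the features, which is the sense in which the abstract describes model performance as non-random.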