Abstract: To improve the accuracy of predicting tar yield in cigarettes, several machine learning methods and ordinary linear regression were used to build prediction models. The standardized mean square error was used as the criterion for judging each model's prediction accuracy. The results showed significant differences among the individual regression models. The machine learning methods predicted tar yield more accurately than traditional simple linear regression. Among the models compared, random forest regression performed best, and its performance was stable and precise. Support vector machine regression was the second-best model. Thus, machine learning methods could be widely applied to predicting tar yield and to other areas of tobacco research.
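
As an illustration of the evaluation criterion, the following is a minimal sketch of how models could be ranked by standardized mean square error. It assumes the criterion is the mean square error divided by the variance of the observed values, a definition the abstract does not spell out. The function names, model labels, and numbers are hypothetical and are not data or results from this study.

```python
from statistics import fmean, pvariance


def standardized_mse(observed, predicted):
    """Mean square error divided by the population variance of the
    observed values (assumed definition; lower is better, and a model
    that always predicts the observed mean scores 1.0)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mse = fmean([(o - p) ** 2 for o, p in zip(observed, predicted)])
    return mse / pvariance(observed)


def rank_models(observed, predictions_by_model):
    """Return (model name, score) pairs sorted from best to worst."""
    scores = {
        name: standardized_mse(observed, predicted)
        for name, predicted in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Illustrative values only (hypothetical tar yields in mg per cigarette).
observed = [8.0, 10.0, 12.0, 14.0]
predictions_by_model = {
    "model_a": [8.5, 9.5, 12.5, 13.5],
    "model_b": [9.0, 9.0, 13.0, 13.0],
    "mean_baseline": [11.0, 11.0, 11.0, 11.0],
}

ranking = rank_models(observed, predictions_by_model)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the score is normalized by the variance of the observations, it can be compared across data sets with different tar yield ranges, and any value below 1.0 indicates a model that beats the trivial mean predictor.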