Abstract: Gold futures prices are influenced by a variety of factors, and predicting them accurately is of significant practical importance. This paper proposes a gold futures price prediction model that integrates multiple data sources and combines LightGBM (Light Gradient Boosting Machine) feature selection with an LSTM model. First, the acquired macroeconomic and technical indicators are preprocessed. The sentiment of unstructured news headlines is annotated using several methods, and the annotations are combined into a weighted sentiment index. The Baidu search indices of multiple keywords are likewise merged into a comprehensive Baidu search index. Second, LightGBM is used to rank the importance of the macroeconomic and technical indicators and to extract the key features. Finally, the selected features, together with the weighted sentiment index and the comprehensive Baidu search index, serve as the input variables of the LSTM forecasting model. Empirical results show that the LightGBM-LSTM model with multi-source data achieves the smallest prediction error among the models compared and predicts the closing price of gold futures more accurately than the benchmark model.
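The data flow described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the function names, the weights, the feature names, and the lookback length are all hypothetical, and the importance scores are assumed to have been produced beforehand by a fitted LightGBM model. The sketch only shows how several series might be merged into one weighted index, how the top-ranked features might be kept, and how the resulting columns might be cut into fixed-length windows for a sequence model such as an LSTM.

```python
from typing import Dict, List, Sequence, Tuple


def weighted_index(series: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Merge aligned daily series into one index via a normalized weighted average.

    Assumption: the weighting scheme is illustrative; the abstract does not
    state how the sentiment or search-index weights are chosen.
    """
    if len(series) != len(weights):
        raise ValueError("need exactly one weight per series")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_days = len(series[0])
    if any(len(s) != n_days for s in series):
        raise ValueError("all series must have the same length")
    return [sum(w * s[t] for w, s in zip(weights, series)) / total
            for t in range(n_days)]


def select_top_k(importance: Dict[str, float], k: int) -> List[str]:
    """Return the names of the k features with the highest importance scores.

    Assumption: `importance` maps feature names to scores obtained elsewhere,
    for example from a fitted LightGBM model.
    """
    return sorted(importance, key=lambda name: importance[name], reverse=True)[:k]


def make_windows(columns: Sequence[Sequence[float]],
                 target: Sequence[float],
                 lookback: int) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (samples, lookback, features) windows and next-step targets."""
    inputs, targets = [], []
    for end in range(lookback, len(target)):
        # Each window holds the `lookback` days that precede day `end`.
        window = [[col[t] for col in columns] for t in range(end - lookback, end)]
        inputs.append(window)
        targets.append(target[end])
    return inputs, targets
```

In such a setup, the columns passed to `make_windows` would be the selected indicator series plus the two merged indices, and the target would be the closing price; the windows would then typically be scaled before being fed to the forecasting model.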