Return forecasting plays a crucial role in identifying stocks with profit potential, making it a valuable tool for constructing investment portfolios. In previous studies, financial news, which covers events and announcements related to companies and the broader economy, has demonstrated significant predictive power for future stock performance.
The conventional way of applying financial news data to stock picking involves a multi-step extraction-and-validation process, as illustrated in Fig. 1(a): (i) formulating numerical features (e.g., sentiments, topics) that are expected to have a predictive relationship with stock performance (e.g., forward return, volatility); (ii) developing the calculation processes that extract these features (e.g., training a sentiment classification model); and (iii) validating the predictive power of the extracted features through statistical analysis or by building forecasting models. This process can be time-consuming and may require additional data and continuous refinement.
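To make the three steps of this conventional process concrete, the following is a minimal sketch rather than any particular published pipeline. It assumes a hypothetical word-list sentiment feature (the `POSITIVE` and `NEGATIVE` lexicons are invented for illustration, standing in for a trained sentiment classifier) and validates it with a simple Pearson correlation against forward returns. The headlines and return values are made-up toy inputs, not real market data.

```python
import numpy as np

# Step (i): formulate a feature. Here, a lexicon-based sentiment score.
# Both word lists are hypothetical and chosen only for illustration.
POSITIVE = {"beats", "growth", "upgrade", "record"}
NEGATIVE = {"misses", "lawsuit", "downgrade", "recall"}


def sentiment_score(headline: str) -> float:
    """Step (ii): extract the feature from raw text.

    Returns a score in [-1, 1]; 0.0 when no lexicon word is present.
    """
    tokens = headline.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Toy inputs (invented): one headline and one forward return per stock.
headlines = [
    "Company A beats estimates on record growth",
    "Company B faces lawsuit after product recall",
    "Company C receives analyst upgrade",
    "Company D misses revenue forecast",
]
forward_returns = np.array([0.02, -0.03, 0.01, -0.01])

scores = np.array([sentiment_score(h) for h in headlines])

# Step (iii): validate predictive power, here with a Pearson correlation
# between the extracted feature and the forward return.
correlation = np.corrcoef(scores, forward_returns)[0, 1]
print(f"feature/return correlation: {correlation:.3f}")
```

Each step carries its own design decisions (which feature, which extractor, which validation test), which is where the time and refinement cost of the conventional process arises.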
Large Language Models (LLMs) generate numerical representations (or embeddings) of text that effectively capture semantic relationships. These embeddings can naturally be utilised as features for forecasting tasks. Building on this intuition, this paper investigates the potential of direct news-to-return prediction by fine-tuning LLMs.
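The idea of using text embeddings directly as forecasting features can be sketched as follows. This is an illustrative simplification under stated assumptions, not the fine-tuning approach investigated in this paper: `toy_embed` is a hypothetical hashed bag-of-words stand-in for an LLM embedding (so that the example runs without a model download), the forecasting head is a plain ridge regression, and the news texts and return values are invented toy inputs.

```python
import hashlib

import numpy as np


def toy_embed(text: str, dim: int = 16) -> np.ndarray:
    """Hypothetical stand-in for an LLM embedding.

    Hashes each token into one of `dim` buckets and L2-normalises the
    counts. A real system would use the model's hidden states instead.
    """
    vec = np.zeros(dim)
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def fit_ridge(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: solve (X^T X + alpha I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


# Toy inputs (invented): one news text and one forward return per stock.
news = [
    "Company A reports strong quarterly earnings",
    "Company B announces product recall",
    "Company C wins large government contract",
    "Company D cuts full-year guidance",
]
forward_returns = np.array([0.02, -0.03, 0.01, -0.01])

# News text -> embedding -> predicted return, with no hand-crafted features.
X = np.stack([toy_embed(t) for t in news])
w = fit_ridge(X, forward_returns, alpha=0.1)
predicted = X @ w

# Stocks ranked from highest to lowest predicted return, e.g. for picking.
ranking = np.argsort(-predicted)
print(ranking)
```

In this sketch the embedding is fixed and only the regression head is fitted; fine-tuning would additionally update the parameters that produce the embedding so that the representation itself adapts to the return-forecasting objective.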