Predicting Bitcoin: a robust model for predicting Bitcoin price directions based on network influencers
The ability to predict financial markets has tremendous potential to limit exposure to risk and provide better assurances of annualized gains. In this thesis, a model for predicting the future daily price direction of Bitcoin is proposed and evaluated against a purely random model. Bitcoin is a novel digital currency that relies on cryptography instead of a central authority to verify transactions. Without a central authority, Bitcoin requires a complete list of all transactions to be made public so that they can be verified by all users. This unique feature of Bitcoin, where all transactions are public, is exploited by the model to predict future price directions based on the actions of Bitcoin users. The daily activity of the markets, aggregate network features, and the actions of major network influencers are all used as features for the predictive model. Major network influencers are defined as users that accumulate a disproportionate amount of wealth within the Bitcoin network compared to others. Information about the actions of all Bitcoin users is extracted from the blockchain and stored in a relational database for ease of use. Two metrics were created to identify the major network influencers based on the history of their actions recorded on the blockchain. The first metric is based on the concept of an h-index, often used in academia to rank authors by their citations, and ranks users by the amount of wealth they accumulate monthly. The second metric is based on Pareto optimization of two objectives: the maximum increase in wealth with the least amount of activity. All of the major network influencers identified were then used as features, in combination with aggregate network features and market data, to test and evaluate several predictive models.
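The two influencer metrics described above can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the function names `h_index` and `pareto_front` are hypothetical, the h-index analogue treats each month's wealth gain like a citation count (in some unspecified unit), and the Pareto objectives are taken to be a single wealth-increase value (maximized) and a single activity value (minimized) per user.

```python
def h_index(monthly_gains):
    """Largest h such that at least h months each have a gain >= h.

    Assumption: gains are expressed in some fixed unit; the thesis may
    define the metric differently.
    """
    ranked = sorted(monthly_gains, reverse=True)
    h = 0
    for rank, gain in enumerate(ranked, start=1):
        if gain >= rank:
            h = rank
        else:
            break
    return h


def pareto_front(users):
    """Return the sorted names of users not dominated by any other user.

    `users` maps a name to (wealth_increase, activity). A user is dominated
    when another user has wealth_increase at least as high and activity at
    least as low, with at least one of the two strictly better.
    """
    front = []
    for name, (gain, activity) in users.items():
        dominated = any(
            g >= gain and a <= activity and (g > gain or a < activity)
            for other, (g, a) in users.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)
```

Under these assumptions, users with a high h-index analogue or users on the Pareto front would be the candidates for "major network influencers"; how the thesis combines or thresholds the two metrics is not stated in the abstract.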
The models created were based on non-linear equations, support vector machines, decision trees, and XGBoost; all were evaluated and compared using the same data. The XGBoost model consistently proved to be much more accurate than all other models and was used for the final set of experiments. The XGBoost model was compared with a purely random Monte Carlo model using the entire history of data for the period of 2013–2016. The first set of experiments was conducted using various sizes of training and testing data; in each case the XGBoost model had an accuracy 20% greater than that of the Monte Carlo model. For the final experiments, the model was tested in a realistic scenario, predicting the price direction for each future day while also being re-trained using the results of each new day. In this scenario the XGBoost model achieved 70%–79% accuracy, compared with approximately 50% for the Monte Carlo model.
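The final evaluation scheme described above, predicting one day ahead and then re-training with that day's outcome, can be sketched as a walk-forward loop next to a random-guess baseline. This is an illustrative sketch only, not the thesis's implementation: the function names, the `warmup` parameter, and the majority-direction stand-in model are all assumptions, and the stand-in is deliberately not XGBoost so that the example depends only on the standard library.

```python
import random


def walk_forward_accuracy(features, labels, fit, predict, warmup):
    """Predict each day after `warmup`, re-fitting on all earlier days first."""
    hits = 0
    for day in range(warmup, len(labels)):
        model = fit(features[:day], labels[:day])  # only past data is visible
        hits += predict(model, features[day]) == labels[day]
    return hits / (len(labels) - warmup)


def random_baseline_accuracy(labels, trials=1000, seed=0):
    """Mean accuracy of uniformly random up/down guesses over many trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        hits = sum(rng.choice((0, 1)) == y for y in labels)
        total += hits / len(labels)
    return total / trials


# Stand-in model (an assumption, not XGBoost): always predict the direction
# that was most common in the training window (1 = up, 0 = down).
def fit_majority(X, y):
    return int(sum(y) * 2 >= len(y))


def predict_majority(model, x):
    return model
```

Any classifier can be plugged in through the `fit` and `predict` callables. The key design point is that the model for a given day is trained only on days strictly before it, which avoids leaking future information into the prediction.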