Added DataLoader with AlphaVantage integration and added Requirements.txt, #4

Closed
wants to merge 3 commits into from


@marvin-hansen marvin-hansen commented May 13, 2019

Following issue #3, this PR implements most of the requests. Specifically,

  1. Adds central DataLoader that allows fetching data either from a local file or online from AlphaVantage. The latter allows fetching Crypto, Forex, and Equity data down to minute tick resolution.

  2. Adds requirements.txt. Most modern IDEs install missing requirements automatically. This also fixes the missing mpl_finance package.

  3. Adds sample code in the main function.

The rationale for the central DataLoader is to prepare the ground for three open questions:

A-1) When adding technical indicators, does the agent trade better or worse?

With the central DataLoader, one just pulls the matching indicator, merges it with the dataset over the shared index, and runs the experiment again. Specifically, the question is whether common trading practices such as price distance from BBand or distance from SMA20/200 help the agent to trade better.
The required data pre-processors ("procs") can easily be added on top of new utils.
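
As a rough illustration of the "pull the indicator, merge over the shared index" step, here is a minimal sketch in plain Python. The function names (`sma`, `sma_distance`, `merge_on_index`), the column name `sma3_dist`, and the sample prices are all made up for this example and are not part of the PR; a window of 3 stands in for SMA20/200 to keep the data short.

```python
def sma(values, window):
    """Simple moving average; None until the window is full."""
    out, running = [], 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def sma_distance(closes_by_date, window):
    """Indicator keyed by date: relative distance of the close from its SMA."""
    dates = sorted(closes_by_date)  # ISO dates sort chronologically
    closes = [closes_by_date[d] for d in dates]
    return {
        d: None if avg is None else (close - avg) / avg
        for d, close, avg in zip(dates, closes, sma(closes, window))
    }


def merge_on_index(dataset, indicator, column):
    """Join an indicator onto the dataset over the shared date index."""
    return {
        date: dict(row, **{column: indicator[date]})
        for date, row in dataset.items()
        if date in indicator
    }


# Hypothetical sample data, for illustration only.
dataset = {
    "2019-05-06": {"close": 10.0},
    "2019-05-07": {"close": 11.0},
    "2019-05-08": {"close": 12.0},
    "2019-05-09": {"close": 13.0},
}
closes = {date: row["close"] for date, row in dataset.items()}
features = merge_on_index(dataset, sma_distance(closes, window=3), "sma3_dist")
```

Rows before the window is full carry `None` for the indicator, so an experiment would either drop them or start the episode after the warm-up period.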

A-2) Does transfer learning work as well in RL as it does in image classification?

For any given image classification problem, the usual approach is to download a pre-trained model,
fine-tune it a bit, and get pretty damn good results. Can we do the same thing with RL trading agents?
For example, train an agent on all of the FANG stocks, save the agent (pickle?), then download the stored agent, run it on another stock, say ADBE, fine-tune it a bit, and see how that works?

A-3) How would delta-indicators and transfer learning impact gradient-free neuroevolution RL [1] and evolution strategy with Bayesian optimization [2], especially when the fitness (ranking) function is modified to use the F1 score [3] w.r.t. discerning True/False Buy/Sells?

[1]
https://towardsdatascience.com/reinforcement-learning-without-gradients-evolving-agents-using-genetic-algorithms-8685817d84f
https://github.com/paraschopra/deepneuroevolution/blob/master/openai-gym-cartpole-neuroevolution.ipynb

[2]
https://github.com/huseinzol05/Stock-Prediction-Models/blob/master/free-agent/evolution-strategy-bayesian-agent.ipynb

Note: the Bayesian optimization in the trading agent above is pretty slow, so I suggest replacing it with optuna to speed things up while allowing parallel optimization and GPU acceleration of the model.
https://optuna.org/#key_features
https://github.com/pfnet/optuna/blob/master/examples/quadratic_simple.py
https://github.com/pfnet/optuna/blob/master/examples/pytorch_simple.py
https://github.com/pfnet/optuna/blob/master/examples/tensorflow_estimator_simple.py

[3] The F1 score is the harmonic mean of precision (how many of the predicted Buy signals actually are buy signals) and recall (how many of the actual buy signals were predicted). Details in Chapter 1, Advances in Financial Machine Learning
https://www.amazon.com/Advances-Financial-Machine-Learning-Marcos/dp/1119482089
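
For reference, a minimal sketch of how the F1 score could be computed for Buy signals. The function name `f1_score`, the label strings, and the sample signal lists are assumptions made for this example; they do not come from the PR or the book.

```python
def f1_score(predicted, actual, positive="buy"):
    """F1 = harmonic mean of precision and recall for the positive label."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p == positive and a == positive)
    false_pos = sum(1 for p, a in pairs if p == positive and a != positive)
    false_neg = sum(1 for p, a in pairs if p != positive and a == positive)
    if true_pos == 0:
        return 0.0  # no correct Buy signal: precision and recall are both 0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical signals, for illustration only.
predicted = ["buy", "buy", "hold", "buy", "hold"]
actual = ["buy", "hold", "buy", "buy", "hold"]
score = f1_score(predicted, actual)  # 2 true Buys, 1 false Buy, 1 missed Buy
```

Used as a fitness function, this would rank agents by how well they discern true from false Buy signals rather than by raw profit; the same function with `positive="sell"` covers Sell signals.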

@marvin-hansen
Author

Added reference to the optuna optimizer.

@notadamking
Owner

Hey Marvin, I really appreciate your contributions and I can tell you've done your research! I have quickly reviewed your submitted code, and it looks high quality, so I'd be happy to integrate it into the current repo.

However, I am currently writing the follow-up article to the previous one, and it includes quite a few optimizations of the current code, including implementing optuna and calculating technical indicators using the ta library.

I would gladly implement your data loader into my code for the article and give you a shout out if you'd like. Otherwise, the article will be published some time this week, and when it is, I will update this repo and get your PR merged.

@notadamking
Owner

After looking a bit deeper into your code, I've realized we will need to update the DataLoader to support multiple data set integrations, rather than just AlphaVantage. For example, it would be nice to switch between data sets to compare them, as well as allow us to support future APIs besides AlphaVantage.
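
One way such a provider-agnostic loader could look is sketched below. All class and method names (`DataProvider`, `CsvTextProvider`, `DataLoader.register`, `DataLoader.load`) and the sample rows are hypothetical and are not taken from the PR. The only concrete provider here reads in-memory CSV text so the example stays runnable offline; an AlphaVantage or Quandl provider would be another subclass of the same interface.

```python
import csv
import io
from abc import ABC, abstractmethod


class DataProvider(ABC):
    """Common interface every data set integration would implement."""

    @abstractmethod
    def load(self, symbol):
        """Return a list of {"date": str, "close": float} rows."""


class CsvTextProvider(DataProvider):
    """Provider backed by CSV text held in memory (stand-in for local files)."""

    def __init__(self, csv_by_symbol):
        self._csv_by_symbol = csv_by_symbol

    def load(self, symbol):
        reader = csv.DictReader(io.StringIO(self._csv_by_symbol[symbol]))
        return [{"date": r["date"], "close": float(r["close"])} for r in reader]


class DataLoader:
    """Routes a request to whichever registered provider the caller names."""

    def __init__(self):
        self._providers = {}

    def register(self, name, provider):
        self._providers[name] = provider

    def load(self, symbol, source):
        if source not in self._providers:
            raise ValueError("unknown data source: " + source)
        return self._providers[source].load(symbol)


# Hypothetical sample data, for illustration only.
loader = DataLoader()
loader.register("local", CsvTextProvider({
    "ADBE": "date,close\n2019-05-13,270.5\n2019-05-14,275.0\n",
}))
rows = loader.load("ADBE", source="local")
```

Because every provider returns rows in the same shape, switching data sets for a comparison only means changing the `source` argument.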

@marvin-hansen
Author

marvin-hansen commented May 14, 2019

@notadamking Thank you Adam,

In a nutshell, how about suspending this PR and adding the code directly to your new repo?

The code is a stripped-down version of my StockUtils. However, I phased out TA-Lib a few months ago, replaced it with a TI-loader, and never looked back.

https://github.com/marvin-hansen/StockUtils

The full utils contain:

  1. Quandl & AlphaVantage integration (As data providers)
  2. A managed cache (to mitigate the API download limit)
  3. ProcFlow

ProcFlow isn't pushed to the public repo yet, but it is a mechanism for daisy-chaining data pre-processing workflows (hence ProcFlow) in a way that allows rapid experimentation. Effectively, each ProcFlow describes a "FeatureSet" for a given ticker and adds a unique ID to it, so you can load FeatureSet-23 and compare it to FeatureSet-14. I wrote it for a previous project, and it enabled me to run about 15 to 25 complete ML experiments per day to figure out which feature set impacts the accuracy of an FC net the most. A few of the key insights are documented here:
samre12/deep-trading-agent#7 (comment)
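
To make the daisy-chaining idea concrete, here is a small sketch of what such a mechanism might look like. It is not the actual ProcFlow code: the class `FeatureFlow`, its methods, the two pre-processors, and the hash-based ID scheme are all assumptions made for this example (the real FeatureSet IDs appear to be numbers such as 23 or 14).

```python
import hashlib


class FeatureFlow:
    """Chains pre-processing steps for one ticker and derives an ID from them."""

    def __init__(self, ticker):
        self.ticker = ticker
        self._steps = []

    def add(self, name, func):
        self._steps.append((name, func))
        return self  # returning self allows chained .add(...) calls

    @property
    def feature_set_id(self):
        # Assumption: the ID is a short hash of the ticker and the step names.
        key = self.ticker + "|" + ">".join(name for name, _ in self._steps)
        return hashlib.sha1(key.encode("utf-8")).hexdigest()[:8]

    def run(self, rows):
        for _, func in self._steps:
            rows = func(rows)  # each step consumes the previous step's output
        return rows


def add_change(rows):
    """Pre-processor: close minus open for each bar."""
    return [dict(r, change=r["close"] - r["open"]) for r in rows]


def add_range(rows):
    """Pre-processor: high minus low for each bar."""
    return [dict(r, range=r["high"] - r["low"]) for r in rows]


# Hypothetical bar, for illustration only.
bars = [{"open": 10.0, "high": 12.0, "low": 9.5, "close": 11.5}]
flow_a = FeatureFlow("ADBE").add("change", add_change)
flow_b = FeatureFlow("ADBE").add("change", add_change).add("range", add_range)
features_a = flow_a.run(bars)
features_b = flow_b.run(bars)
```

Two flows with different steps get different IDs, so the results of each experiment can be stored and compared under the ID of the feature set that produced them.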

To your point, yes you can integrate the StockUtils in any repo or project you want.

Alternatively, if you send me an invite to your next article repo, I can do the code integration and send you a PR with some comments & test cases.

Either way, when you integrate code from my repos and actually use it in your article, please mention me as a contributor in the resulting article.

@marvin-hansen
Author

DataLoader has been updated with an interface, an implementation, and a stub for Quandl.
https://github.com/marvin-hansen/StockUtils/tree/master/src/utils

ProcFlow with some documentation is now in the repo:
https://github.com/marvin-hansen/StockUtils/blob/master/src/procs/ProcFlow.py

@notadamking
Owner

After further inspection, it looks like this PR would add unnecessary functionality (e.g. alphavantage bindings) and is not generic enough for multiple data sets. I will have to rethink the DataLoader and come up with a more generic solution.

@notadamking notadamking closed this Jun 7, 2019