Addition of cuDF and HoloViews for GPU acceleration #485
BTW, this addresses #478 and shows really good performance potential from adding either the HoloViews option or the cuDF option.
Thanks for the contribution @AdityaR-Bits! The cuDF and HoloViews backends are certainly a valuable addition to Lux.
@dorisjlee, @AdityaR-Bits's project has shown that there is value in modularizing Lux into components that separate the data structure, interestingness calculations, and visualization frameworks. Are there any plans (or resources) to do so?
Hi @exactlyallan, definitely agree that it would be a good refactoring change. We have experimented with separating the LuxDataFrame data structure from the visualization and recommendation modules in a separate branch, but that work has not yet been merged into the main branch.
Overview
This implementation of LUX adds the option of using NVIDIA GPUs through RAPIDS cuDF, with HoloViews as the plotting engine. It achieves a 3-10x speedup over the original LUX and avoids browser memory issues on datasets with millions of rows or more (measured on an NVIDIA RTX A3000 Laptop GPU).
HoloViews
HoloViews does not require the creation of a JSON file, which for larger datasets is expensive in both memory and time. It can render orders of magnitude more data points without becoming slow, which also removes the need to fall back from scatter plots to heatmaps when the number of rows is too high. For simplicity of viewing, this implementation does not rely on the LUX widget to display the charts.
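To give a rough sense of why an embedded JSON payload pushes a plotting pipeline toward heatmaps, here is a minimal sketch using only the Python standard library. It is not code from this PR or from LUX: the helper names (make_points, scatter_payload_bytes, heatmap_payload_bytes), the synthetic data, and the bin count are all assumptions made for illustration. It compares the size of a JSON payload that carries every point against one that carries only binned counts.

```python
import json
import random


def make_points(n_rows, seed=0):
    """Generate n_rows synthetic (x, y) points in the unit square (illustrative data)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_rows)]


def scatter_payload_bytes(points):
    """Byte size of a JSON payload that embeds every point as a record."""
    records = [{"x": x, "y": y} for x, y in points]
    return len(json.dumps(records).encode("utf-8"))


def heatmap_payload_bytes(points, bins=20):
    """Byte size of a JSON payload holding only a bins x bins grid of counts."""
    counts = [[0] * bins for _ in range(bins)]
    for x, y in points:
        # Clamp the index so a coordinate at the upper edge lands in the last bin.
        i = min(int(x * bins), bins - 1)
        j = min(int(y * bins), bins - 1)
        counts[i][j] += 1
    return len(json.dumps(counts).encode("utf-8"))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        pts = make_points(n)
        print(n, scatter_payload_bytes(pts), heatmap_payload_bytes(pts))
```

The per-point payload grows roughly linearly with the row count, while the binned payload stays close to constant, which is the trade-off a JSON-based renderer makes when it switches to heatmaps. A plotting engine that does not need the per-point JSON avoids having to make that switch.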
To Run
To run the cuDF + HoloViews implementation, simply do the following
To plot the HoloViews curves, run df.maintain_recs() rather than df in a different cell.
Example Output
A brief output is shown below
Next Steps
This implementation is a proof of concept demonstrating the acceleration that RAPIDS can bring to LUX. It also shows the benefits of adding HoloViews as an additional plotting option. @dorisjlee, would you be open to discussing with @exactlyallan, @AjayThorve and me whether and how an integration like this might proceed?