Add GWS Auto-Tuner #8

Open
jtesta opened this issue Jun 18, 2019 · 0 comments

jtesta commented Jun 18, 2019

The global work size (GWS) parameter in OpenCL tells a device how many work-items to run per kernel launch. Tuning this parameter can yield large improvements in throughput (sometimes over 50%).

Currently, the optimal GWS for each GPU model is determined through manual experimentation and hard-coded in gws.c. This method does not scale well, as it leaves out many popular hardware models. A much better approach is an auto-tuner that determines the optimal setting at run time.

A proposed solution: each time the generation or lookup code runs, it checks whether an optimal setting is already known from a previous invocation. The lookup uses a hash table keyed on the table parameters, device name, and driver version. (The table parameters have been observed to affect the optimal GWS, and driver improvements can make a difference as well.) If an optimal setting is already known, it is used; otherwise, variations of the GWS are tested until an optimal value is found.
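The lookup-then-tune flow could be sketched roughly as below. This is only an illustration under several assumptions: every name here (`gws_key`, `gws_cache`, `gws_resolve`, `gws_fake_bench`, and so on) is hypothetical and not part of the existing code; a small linear-scan array stands in for the hash table to keep the sketch short; the cache lives in memory only, so remembering results across invocations would still need the entries to be written to disk; and the benchmark callback is a stand-in for timing a real kernel run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define GWS_CACHE_MAX 64
#define GWS_KEY_LEN 256

/* Hypothetical key components, taken from the proposal above. */
typedef struct {
    const char *table_params;
    const char *device_name;
    const char *driver_version;
} gws_key;

typedef struct { char key[GWS_KEY_LEN]; size_t gws; } gws_entry;
typedef struct { gws_entry entries[GWS_CACHE_MAX]; size_t count; } gws_cache;

/* Benchmark callback: returns throughput for a GWS value (higher is better). */
typedef double (*gws_bench_fn)(size_t gws, void *ctx);

/* Flattens the three components into one string key. Returns 0, or -1 if truncated. */
int gws_make_key(const gws_key *k, char *out, size_t out_len) {
    int n = snprintf(out, out_len, "%s|%s|%s",
                     k->table_params, k->device_name, k->driver_version);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Linear scan (a stand-in for a real hash table). Returns 1 on a hit. */
int gws_cache_get(const gws_cache *c, const char *key, size_t *gws) {
    for (size_t i = 0; i < c->count; i++) {
        if (strcmp(c->entries[i].key, key) == 0) {
            *gws = c->entries[i].gws;
            return 1;
        }
    }
    return 0;
}

/* Appends an entry. Returns 0, or -1 if the cache is full. */
int gws_cache_put(gws_cache *c, const char *key, size_t gws) {
    if (c->count >= GWS_CACHE_MAX) return -1;
    snprintf(c->entries[c->count].key, GWS_KEY_LEN, "%s", key);
    c->entries[c->count].gws = gws;
    c->count++;
    return 0;
}

/* Tries every candidate and keeps the one with the highest throughput. */
size_t gws_autotune(const size_t *candidates, size_t n,
                    gws_bench_fn bench, void *ctx) {
    assert(n > 0);
    size_t best = candidates[0];
    double best_rate = bench(best, ctx);
    for (size_t i = 1; i < n; i++) {
        double rate = bench(candidates[i], ctx);
        if (rate > best_rate) { best_rate = rate; best = candidates[i]; }
    }
    return best;
}

/* Uses the cached value if one exists; otherwise tunes and remembers the result. */
size_t gws_resolve(gws_cache *c, const char *key, const size_t *candidates,
                   size_t n, gws_bench_fn bench, void *ctx) {
    size_t gws;
    if (gws_cache_get(c, key, &gws)) return gws;
    gws = gws_autotune(candidates, n, bench, ctx);
    (void)gws_cache_put(c, key, gws); /* a full cache just means re-tuning later */
    return gws;
}

/* Illustration only: pretends throughput peaks at the GWS pointed to by ctx. */
double gws_fake_bench(size_t gws, void *ctx) {
    size_t peak = *(const size_t *)ctx;
    double d = (gws > peak) ? (double)(gws - peak) : (double)(peak - gws);
    return 1.0 / (1.0 + d);
}
```

A real tuner would probably search more cleverly than an exhaustive pass over a fixed candidate list (for example, doubling the GWS until throughput stops improving), but the cache-first structure would stay the same.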

The manual GWS command line argument ("-gws") must be preserved in case the user wishes to override this setting.
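One way the override precedence might look is sketched below. The function names are hypothetical, and the sketch assumes the value follows "-gws" as a separate positive decimal argument; the actual argument handling in the existing code may differ.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 and stores the value if "-gws <n>" is present with a valid
 * positive integer; returns 0 otherwise. (Assumed argument format.) */
int parse_gws_override(int argc, char **argv, size_t *out) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-gws") != 0) continue;
        const char *s = argv[i + 1];
        char *end = NULL;
        if (!isdigit((unsigned char)s[0])) return 0; /* rejects "-5", "abc" */
        errno = 0;
        unsigned long long v = strtoull(s, &end, 10);
        if (errno != 0 || *end != '\0' || v == 0) return 0;
        *out = (size_t)v;
        return 1;
    }
    return 0;
}

/* A manual setting always wins; the auto-tuned value is only the fallback. */
size_t effective_gws(int argc, char **argv, size_t tuned_gws) {
    size_t manual;
    return parse_gws_override(argc, argv, &manual) ? manual : tuned_gws;
}
```

Checking for the override before running the tuner at all would also let users skip the tuning pass entirely.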
