-
Notifications
You must be signed in to change notification settings - Fork 540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add cache prefix #358
Comments
Hello @ernado |
Hi 👋 Are you open for a contribution for this issue? We're experiencing it as well and using the suggested workaround (using |
Opened #366, feedback appreciated |
This could help to workaround the problem we have on our project. |
I've recently noticed a fair bit of slow-down in our parallel builds from 2:30 to 3:30, see https://github.com/evcc-io/evcc/actions/runs/5124962244 for an example. It shows that both build and test steps- despite stable dependencies- will download a fair bit of packages that should already be there:
It seems likely that this might be a conflict with the parallel
It seems that the situation described in this issue might be quite common. |
And here is the result applying this issue's PR: https://github.com/evcc-io/evcc/actions/runs/5126503582. Build time reduced from 3min to 1min at the cost of 3 separate caches. Worth the efforts. Any chance to get this reviewed/merged? |
Hello everyone. We would not like to implement such kind of logic, because the initial caching logic for setup-go should cover the most often use cases. If you need more flexibility with cache, we would like to recommend actions/cache. It's relevant to setup-go ADR. For now I'm going to close the issue. |
Hi @dmitry-shibanov and thanks for the comment. Are you open to adding docs for the cases the caching logic in setup-go doesn't cover? It does seem like a gotcha that should be documented (based on the backlinks to this issue) as the caching is now enabled by default. Something in the lines of:
|
Agree with @erezrokah, adding this to docs (and probably providing a reference workaround + link this issue) would be very helpful. Also agree with @dmitry-shibanov that this feature should not be implemented, because it is hard to cover all use-cases, so just let user to handle caches manually (like we did in our projects). |
Plus this needs to cover which folders to cache for Go best practices.
This is really unfortunate. The additional logic is minimal and the use is large. It seems quite wide-spread to have build, test, lint jobs parallellized and in all these instances you'll need to separate cache keys.
I disagree. The case covered is in the users hands by chosing the right key. |
And also explain to use the Go version from |
@erezrokah would you potentially consider rebasing your PR and keeping your fork going forward? |
I'm not sure that would the best solution. If there are docs that users can copy paste and get the same functionality then my PR would be mostly nice to have I think, and won't justify maintaining a fork |
@dsame thank you for reopening as investigation. Is there anything we could contribute? |
I removed
I'm not sure how you arrived at that "~90% of basic use-cases" but it seems unlikely to me that 90% of projects (if that's what it means) don't have concurrent jobs running in a pipeline. And since caching now is enabled by default it means it won't work by default. I don't think we're asking for fine-tuning of the caching mechanism here, but just for a way to make the caching work in even more cases. Whatever that 10% is, still doesn't seem an irrelevant number especially when it can be fixed by a 2-line PR |
The issue relates to the feature request for optional build directory caching https://github.com/actions/setup-actions-team/issues/26 The advantage of (optional) skipping build directory at all seems to be more preferable than having key-prefix because it allows to have multiple build steps in the one job instead of multiple jobs installing the same go version just for sake of getting different caches having the almost same content due to the common dependencies. Although there might be a case where caching build directory brings significant improving over skipping it. At the moment there's no any discovered, but it would be useful know different opinions to make a final decision Adding the cache prefix brings significant drawback of using log of storage with multiple caches and hitting the limit very soon, making the caching impossible at all. Thanks in advance for sharing your thoughts. |
Seems I can't follow the link. Would like to read up on build directory caching to understand the proposal better. |
We want to be able to cache test results as in some cases clean testing would run for ~10 minutes, whilst the cached results will be available in seconds. |
I'm still failing to understand the proposal of "optional build directory caching". The link in #358 (comment) does not work.
We're also caching:
We also have different build steps like generating assets, updating docs and checking repo is in "clean" state that are all parallelised. Some of these steps use Go-ased tools but would not necessarily download modules or fully populate the build cache. Using any of these steps for heavier tasks like build or test has ended up with cache hits on half-populated caches. |
Thanks for sharing @dsame ❤️ Can you maybe share a before/after workflow example with the optional build step so we can understand the proposal better? Won't that mean that the different steps run in serial as opposed to parallel build jobs? A cache prefix tries to solve using Please note that at least in our case we also have completely different workflows (not only jobs) that run in parallel, and each workflow needs a distinct cache (for example, one used for testing invoked on pull request triggers, and another used for releasing artifacts invoked on tag push triggers).
In our case we treat managing cache limits as a distinct problem to solve (especially as we have a monorepo with many components, each one with a different cache), unrelated to the requirement to have separate caches for jobs. There are official guides (see https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries) on how to manage cache limits, and it might be better to leave the decision on how to manage cache limits to users. Regardless of cache limits and this issue, I would prefer to have a cache miss than a cache hit on a cache created from an unrelated job, as at least a cache miss will create consistent results. |
@andig the complex cases like the one you have described should use @actions/actions-cache , see the samples as starting point |
@dsame after looking at the samples I still see this an ill-advised tradeoff. We require people to think about cache keys, restore keys, pathes, hashfiles etc instead of providing them the absolutely simple and straight-forward capability of separating their caches per prefix which is a single-line PR. More so, with any change in actions/cache or the way how Go dependencies should be cached, we would require everyone to update their manual cache settings once more. The fact that even documenting the advanced cases has been ongoing for six months now indicates that this is not straight-forward to get right. I can only suggest to rethink this decision. I do not object to adding additional docs for advanced use cases, but it would be super handy if usage for semi-simple cases could be simplified. Update Let me add one example. a4d10f0 added the OS to the cache key. Everyone using actions/cache would need to follow this best practice manually. It should also make the docs for the advanced use cases. Adding the prefix would implement this for everyone at zero cost. |
Friendly ping @dsame would appreciate a comment if these concerns are valid? |
I think it would be great to have a little more control over the setup-go cache options. I was surprised to see that several of my branches share the same cache, for example. |
I have similar issue as @jsirianni - different branches share the same cache. When my feature branch starts using first external module, From my (user of setup-go) perspective, cache integration covers only one case: trunk-based development with sequential (non-parallel) jobs. But it isn't clear from the beginning. GitHub encourages working on feature branches. Even in this repo nearly all pull requests come from feature branches. So I share @lucacome opinion, that current defaults may be unsuitable for far more than 10% users of setup-go. |
For those that don't want to deal with the unnecessary complexity of job_a:
steps:
- uses: actions/setup-go@v5
with:
cache-dependency-path: |
"${{ env.WORKING_DIR }}/go.sum"
.github/.cache/buster-for-job-a
job_b:
steps:
- uses: actions/setup-go@v5
with:
cache-dependency-path: |
"${{ env.WORKING_DIR }}/go.sum"
.github/.cache/buster-for-job-b Notice the additional files in Another "feature" that this workaround provides is that you can force a rebuild of the job's cache by modifying its respective "cache buster" file. |
Worth mentioning that #416 is still open almost 6 months later. Splitting caches by build remains a complex task. |
Due to how you cannot pass in a cache key to actions/go-setup, we cannot reliably use the cache produced by this action. See issue for more info: actions/setup-go#358
Due to how you cannot pass in a cache key to actions/go-setup, we cannot reliably use the cache produced by this action. See issue for more info: actions/setup-go#358
Due to how you cannot pass in a cache key to actions/go-setup, we cannot reliably use the cache produced by this action. See issue for more info: actions/setup-go#358
<!-- markdownlint-disable MD041 --> #### What this PR does / why we need it actions/setup-go#358 #### Which issue(s) this PR fixes <!-- Usage: `Fixes #<issue number>`, or `Fixes (paste link of issue)`. --> Allows atomic build caches per job that actually dont overwrite each other Co-authored-by: Hilmar Falkenberg <[email protected]>
<!-- markdownlint-disable MD041 --> #### What this PR does / why we need it actions/setup-go#358 #### Which issue(s) this PR fixes <!-- Usage: `Fixes #<issue number>`, or `Fixes (paste link of issue)`. --> Allows atomic build caches per job that actually dont overwrite each other Co-authored-by: Hilmar Falkenberg <[email protected]>
Description:
Add cache key prefix, like that:
Justification:
Cache from multiple workflows can be mixed.
For example, there can be a workflow that only downloads dependencies.
This will produce cache without
.cache/go-build
, which will be used byanother workflows, effectively disabling build cache.
This was the cause in #357
Are you willing to submit a PR?
I don't know typescript well, but I'm learning.
Workaround
Use
actions/cache@v3
manually, something like that:Cons
Cache configuration options can possibly lead to feature bloat, so feel free to decline this feature proposal.
The text was updated successfully, but these errors were encountered: