List of artifacts with with_sbom_overview=true is slow #21281

Open
strigazi opened this issue Dec 4, 2024 · 4 comments

Comments

@strigazi

strigazi commented Dec 4, 2024

After #18610 it looks like queries for artifacts with with_scan_overview=true are very fast.

However, when with_sbom_overview=true is used in repositories with multi-arch artifacts (with or without with_scan_overview=true), the API call is very slow.

These timings are for a repository with both linux/arm64 and linux/amd64 artifacts and the OCI manifests:

  • scan=true & sbom=false <1s
  • scan=true & sbom=true >20s

These are for a repository with the same tags but only linux/amd64

  • scan=true & sbom=false <1s
  • scan=true & sbom=true <1s

I get these numbers with curl or in the UI for the following URL:
https://example.com/api/v2.0/projects/<a project>/repositories/<a repo>/artifacts?with_tag=false&with_scan_overview=true&with_sbom_overview=true&with_label=false&with_accessory=false&page_size=50&page=1

The testing repository is for:
image: rancher/hyperkube
tags: 17

  • harbor version: 2.12.0
@strigazi strigazi changed the title List of arfifacts with with_sbom_overview=true is slow List of artifacts with with_sbom_overview=true is slow Dec 5, 2024
@stonezdj stonezdj self-assigned this Dec 9, 2024
@Vad1mo
Member

Vad1mo commented Dec 9, 2024

@strigazi can you try to reproduce this on our demo environment demo.goharbor.io and post the results here?

@strigazi
Author

strigazi commented Dec 9, 2024

@Vad1mo This demo deployment performs clearly better, but a difference in response time can still be observed.

linux/amd64 [0]

  • scan=true & sbom=false <0.5s
  • scan=true & sbom=true <0.5s

multi platform [1]

  • scan=true & sbom=false <4s
  • scan=true & sbom=true <4s

[0] time curl 'https://demo.goharbor.io/api/v2.0/projects/strigazi/repositories/busybox-amd64%252Fbusybox/artifacts?with_tag=false&with_scan_overview=true&with_sbom_overview=true&with_label=false&with_accessory=false&page_size=50&page=1' -s -o /dev/null
[1] time curl 'https://demo.goharbor.io/api/v2.0/projects/strigazi/repositories/busybox-multi%252Fbusybox/artifacts?with_tag=false&with_scan_overview=true&with_sbom_overview=true&with_label=false&with_accessory=false&page_size=50&page=1' -s -o /dev/null
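The comparison above can be repeated with a small helper. This is only a sketch: the function names `artifacts_url`, `time_request` and `compare_sbom` are made up for illustration, the query parameters are copied from the URLs in this comment, and the timing relies on curl's `--write-out` variable `%{time_total}` instead of the shell's `time`.

```shell
# Sketch: time the artifact-list endpoint with and without the SBOM overview.
# The helper names are hypothetical; base URL, project and repository are
# whatever you pass in.

# Build the artifact-list URL.
# Arguments: base URL, project, repository, with_sbom_overview (true|false).
# The repository must already be encoded the way the API expects,
# e.g. busybox-multi%252Fbusybox.
artifacts_url() {
  local base=$1 project=$2 repo=$3 sbom=$4
  printf '%s/api/v2.0/projects/%s/repositories/%s/artifacts?with_tag=false&with_scan_overview=true&with_sbom_overview=%s&with_label=false&with_accessory=false&page_size=50&page=1\n' \
    "$base" "$project" "$repo" "$sbom"
}

# Print the total request time in seconds as reported by curl,
# discarding the response body.
time_request() {
  curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# Request the same repository with with_sbom_overview=false and =true
# and print one timing line per variant.
compare_sbom() {
  local base=$1 project=$2 repo=$3 sbom
  for sbom in false true; do
    printf 'sbom=%s %ss\n' "$sbom" \
      "$(time_request "$(artifacts_url "$base" "$project" "$repo" "$sbom")")"
  done
}
```

For example, `compare_sbom https://demo.goharbor.io strigazi 'busybox-multi%252Fbusybox'` would print one line for each setting of with_sbom_overview. A single run is noisy, so repeating it a few times gives a fairer comparison.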

I created the test environment as follows:

  • demo.goharbor.io/strigazi/busybox-amd64/busybox
    for i in 1.30.0 1.30.1 1.31.0 1.31.1 1.32.0 1.32.1 1.33.0 1.33.1 1.34.0 1.34.1 1.35.0 1.36.0 1.36.1 1.37.0; do crane copy --platform linux/amd64 docker.io/library/busybox:$i demo.goharbor.io/strigazi/busybox-amd64/busybox:$i ; done

  • demo.goharbor.io/strigazi/busybox-multi/busybox:
    for i in 1.30.0 1.30.1 1.31.0 1.31.1 1.32.0 1.32.1 1.33.0 1.33.1 1.34.0 1.34.1 1.35.0 1.36.0 1.36.1 1.37.0; do crane copy --platform all docker.io/library/busybox:$i demo.goharbor.io/strigazi/busybox-multi/busybox:$i ; done

@strigazi
Author

strigazi commented Dec 9, 2024

I would also like to add that in our environment I see the following WARNING in the core pod logs. For 17 tags, I see the WARNING 17 times.

2024-12-09T22:58:18Z [WARNING] [/server/v2.0/handler/assembler/report.go:104]: overview is empty, retrieve sbom status from execution
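To check whether the number of warnings matches the number of tags, the log lines can be counted. A minimal sketch, assuming the log is available as text on stdin; `count_sbom_warnings` is a hypothetical helper name, and the match string is taken from the log line above.

```shell
# Count the "overview is empty" warnings in a log stream read from stdin.
# grep -c prints the number of matching lines; "|| true" keeps the exit
# status zero when there are no matches (grep then prints 0 and exits 1).
count_sbom_warnings() {
  grep -c 'overview is empty, retrieve sbom status from execution' || true
}
```

The input could come from `kubectl logs` for the core pod, piped into `count_sbom_warnings`; the pod name and namespace depend on the deployment and are not given in this thread.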

@stonezdj
Contributor

The most time-consuming SQL query is:

SELECT b.id, b.digest, b.content_type, b.status, b.version, b.size FROM artifact_blob AS ab LEFT JOIN blob b ON ab.digest_blob = b.digest WHERE ab.digest_af = $1

It is called hundreds of times; the count depends on how many blobs the current artifact has. For each blob, it checks whether the layer can be scanned. If the layer is not scannable, sbom_overview is returned empty; otherwise it returns the "not scanned" information.
[Screenshot of the call tree, 2024-12-10]
