Scalability and performance #122

manthey · 2024-03-26T20:03:31Z

We'd like to conduct an experiment to determine the effect of different CPU, GPU, and memory configurations on performance.

A proposed course of work (feedback encouraged):

We could deploy an instance of DSA on different AWS EC2 instance types and compare the time for the first superpixel and feature generation and the time for a few training iterations.
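As a rough sketch of how the two timings could be captured consistently on each instance, something like the following might work. The stage names and the placeholder workloads are assumptions for illustration only; the real calls into DSA for superpixel/feature generation and training would replace the `lambda` bodies.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, results):
    """Record the wall-clock duration of one stage, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[stage] = time.perf_counter() - start


def run_benchmark(stages):
    """Run each (name, callable) pair in order and return per-stage timings."""
    results = {}
    for name, func in stages:
        with timed(name, results):
            func()
    return results


if __name__ == "__main__":
    # Hypothetical placeholders: swap in the real job invocations.
    stages = [
        ("superpixel_and_features", lambda: time.sleep(0.01)),
        ("training_iterations", lambda: time.sleep(0.01)),
    ]
    for name, seconds in run_benchmark(stages).items():
        print(f"{name}: {seconds:.3f} s")
```

Using `time.perf_counter()` keeps the measurement to wall-clock time on the machine running the script; if the jobs run asynchronously, the job's own start and end timestamps would probably be a better source.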

Possible things to compare to produce some benchmarks:

number of CPUs or cores
availability of a GPU (possibly try different GPU classes)
memory (often coupled with CPU count)
images on local block storage versus S3
number of images. We could have a few test sets, maybe take a large number of images from the TCGA collection and measure the speed of different numbers of images in the different configurations. Ideally, we'd like to try sets that substantially exceed the number of CPU cores so the work can saturate the hardware. I might try powers of 2 or 4 (e.g., 1 image, 4, 16, 64, ...)
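To make the comparison list concrete, here is a minimal sketch that enumerates the runs as a cross product of the factors above, with image-set sizes growing by powers of 4. The function names, the dictionary keys, and the example values (core counts, `"local"`/`"s3"` labels) are all hypothetical, not part of DSA or AWS.

```python
from itertools import product


def image_set_sizes(max_images, base=4):
    """Return 1, base, base**2, ... up to and including max_images."""
    if base < 2:
        raise ValueError("base must be at least 2")
    sizes = []
    n = 1
    while n <= max_images:
        sizes.append(n)
        n *= base
    return sizes


def benchmark_matrix(cpu_counts, gpu_options, storage_options,
                     max_images, base=4):
    """Enumerate every combination of hardware, storage, and set size."""
    sizes = image_set_sizes(max_images, base)
    return [
        {"cpus": c, "gpu": g, "storage": s, "images": n}
        for c, g, s, n in product(cpu_counts, gpu_options,
                                  storage_options, sizes)
    ]


if __name__ == "__main__":
    # Example values are assumptions chosen only to show the shape.
    runs = benchmark_matrix(
        cpu_counts=[4, 16],
        gpu_options=[False, True],
        storage_options=["local", "s3"],
        max_images=64,
    )
    print(len(runs), "runs; first:", runs[0])
```

The full cross product grows quickly, so in practice we would likely prune it to the combinations that correspond to instance types that actually exist.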

Ideally we'd have some infrastructure-as-code way to deploy this so that we can reproduce the results, at least for deploying to a specific EC2 instance style and uploading our data, even if we kick off the individual jobs manually.

manthey · 2024-03-26T20:04:45Z

@bnmajor @jeffbaumes I'd love for one outcome of this to be some description of how to deploy DSA with a custom provisioning file to EC2 in some standardized manner (recognizing that the girder-next work may change what we do).
