Limiting the building threads to 1 for compiling in the ROS buildfarm #1
Comments
Yes. By default, the build will try to use all CPU and RAM resources on the machine.
Given the current packaging build timing, I'd estimate that a ROS build of Drake using a single-threaded build will take approximately 4 hours (assuming no caching from prior builds). Is that satisfactory?
If the buildfarm builds are only supposed to use 1 CPU, then to me the obvious way to implement that would be to only provide a single virtualized CPU in the build machine VMs, at the infrastructure level. Why would the buildfarm VMs provide >1 CPU when the policy is that more than one CPU must not be used? Solving this by dialing back every build tool's individual limit seems like playing whack-a-mole. In any case, if we assume that this needs to be a bazel-specific option, then see the docs at https://bazel.build/run/bazelrc. Instead of patching the source tree, we can put the |
4 hours might be problematic. If I'm not wrong, the limit for ROS buildfarm release jobs is currently set to 120 minutes for Rolling amd64. I'll check with the rest of the infra team, but I will open another issue to discuss potential reductions of this time.
There is parallelization in the ROS buildfarm, but it happens at the executor level rather than the build level (it can parallelize across packages but uses a single thread for each package).
+1, I'll send the PR for patching the ROS buildfarm agents.
Drafted a PR to be discussed with the ROS infra team: ros-infrastructure/ros_buildfarm#1016
Are there any updates on this side of the question? I do anticipate that the Drake build will keep growing in size (build time) in future versions, so I'd like to get out in front of any potential challenges there.
We have discussed this internally in the OSRF infra team. The decision not to support long (and/or memory-intensive) builds was made consciously, to ease the operation (and reduce the cost) of the ROS buildfarm by encouraging users to optimize for resource consumption and build times. This places us in a special use case here. That said, we have plans to support the Drake compilation:
ros-infrastructure/ros_buildfarm#1016 was merged.
While testing the compilation of Drake, I've found that it can take several dozen GB of RAM, especially when processing the Python bindings, since Bazel will launch many compilation threads.
The Drake buildfarm (if I'm not wrong) uses a user.bazelrc that defines a --jobs parameter calculated from the number of processors and the logic in bazel.cmake. In the ROS buildfarm, the rule is to use single-threaded builds for memory and CPU predictability.
To turn the Bazel build in Drake into a single-threaded build, one option is to use the ament_vendor CMake API to include a simple patch against tools/bazel.rc:
I did not find a better way, using environment variables or other approaches, that doesn't require patching the source code.
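As a rough illustration of the bazelrc-based alternative raised in the discussion, the sketch below writes a small rc fragment that caps Bazel at one job. Only the `build --jobs=<n>` line is standard Bazel syntax; the helper name `write_single_job_bazelrc` and the file name are hypothetical, and whether Drake's CMake wrapper or the buildfarm agents would pick up such a file is an assumption that has not been verified here.

```python
from pathlib import Path


def write_single_job_bazelrc(path: Path, jobs: int = 1) -> str:
    """Write a bazelrc fragment capping Bazel's parallel jobs.

    Hypothetical helper for illustration; not part of Drake or ros_buildfarm.
    """
    if jobs < 1:
        raise ValueError("jobs must be >= 1")
    # `--jobs` is a standard Bazel build option; the rest is illustrative.
    content = f"build --jobs={jobs}\n"
    path.write_text(content)
    return content


if __name__ == "__main__":
    # Assumed location; a real setup would need a path that Bazel actually reads.
    print(write_single_job_bazelrc(Path("single_job.bazelrc")), end="")
```

Generating the fragment outside the source tree would avoid carrying a patch, at the cost of depending on where the build environment lets Bazel look for rc files.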