Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a custom distutils command to symlink data_files #592

Merged
merged 7 commits into from
Feb 3, 2024

Conversation

cottsay
Copy link
Member

@cottsay cottsay commented Oct 11, 2023

The default implementation of install_data will always copy files into the destination directory. When we use the 'develop' command, we actually need to specifically tell setuptools to do something with the data_files or they will be ignored. Instead of telling setuptools to use install_data as-is, we can implement a custom version of install_data that will try to symlink the files instead.

The default implementation of install_data will always copy files into
the destination directory. When we use the 'develop' command, we
actually need to specifically tell setuptools to do something with the
data_files or they will be ignored. Instead of telling setuptools to use
install_data as-is, we can implement a custom version of install_data
that will try to symlink the files instead.
This matches the existing behavior in colcon
@cottsay cottsay added the enhancement New feature or request label Oct 11, 2023
@cottsay cottsay self-assigned this Oct 11, 2023
@cottsay cottsay linked an issue Oct 11, 2023 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Oct 11, 2023

Codecov Report

Attention: 17 lines in your changes are missing coverage. Please review.

Comparison is base (ad0feb4) 83.66% compared to head (f15b533) 82.06%.

Files Patch % Lines
colcon_core/distutils/commands/symlink_data.py 0.00% 15 Missing ⚠️
colcon_core/task/python/build.py 81.81% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #592      +/-   ##
==========================================
- Coverage   83.66%   82.06%   -1.61%     
==========================================
  Files          65       66       +1     
  Lines        3765     3746      -19     
  Branches      728      713      -15     
==========================================
- Hits         3150     3074      -76     
- Misses        541      586      +45     
- Partials       74       86      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@RodBelaFarin
Copy link

Tested this PR and it works as expected!
I've changed a *.launch.py file and an *.urdf file. Both changes were applied without the need to run colcon build again.
Thank you so much! Finally --symlink-install works as it intends to work.

@cottsay
Copy link
Member Author

cottsay commented Nov 30, 2023

Finally --symlink-install works as it intends to work.

--symlink-install works as expected with ament_cmake packages as it is. This PR is about tricking setuptools into doing what we want it to so that Python packages work as well.

@cottsay cottsay linked an issue Jan 22, 2024 that may be closed by this pull request
@tonynajjar
Copy link

Hi many are eagerly waiting for this PR and what it promises 😄, any update on when it might be finished?

@cottsay
Copy link
Member Author

cottsay commented Feb 2, 2024

any update on when it might be finished?

Nobody has reviewed the change. I've got feedback from @RodBelaFarin that it worked as they expected (thank you!), but more feedback would be better.

Have you tried the change yourself, @tonynajjar? If it works, I'd like to hear about it.

@cottsay cottsay marked this pull request as ready for review February 2, 2024 15:38
@tonynajjar
Copy link

tonynajjar commented Feb 2, 2024

Works like a charm! I tested it on humble on a launch file and a yaml file. @nuclearsandwich @clalancette can we please have a review for this long-awaited PR? I think it's a big source of confusion especially for beginners when --symlink-install does not work as intended with pure python packages.

Copy link
Contributor

@nuclearsandwich nuclearsandwich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment is clarification and doesn't have any blocking suggestions.

colcon_core/task/python/build.py Show resolved Hide resolved
@RodBelaFarin
Copy link

RodBelaFarin commented Feb 2, 2024

I'm not sure if this should be discussed on ROS Discourse first or here or at all, but espacially after reading @tonynajjar's comment

@nuclearsandwich @clalancette can we please have a review for this long-awaited PR? I think it's a big source of confusion especially for beginners when --symlink-install does not work as intended with pure python packages.

I also have to speak up for all beginners using this build tool for ROS(2) and raise an even bigger question to this PR:

Wouldn't it make sense that --symlink-install becomes the default build task for colcon build anyway and instead reverse it and add an option like --no-symlink-install?

At least I am using it much more this way than the other way around and the default CLI command should be the default behavior, doesn't it?
But maybe I am not aware of all use cases and missing an important point here.

Shortly to my background: I am being a university educator and teaching ROS for 10 years now and also did external trainings for the EU ROSIN project, so I've personally trained multiple hundrets and thounds of people in ROS1 and ROS2. Most of them do not have a background in computer science and most of the times never worked with a Linux based OS, you know... the typical Windows user. So using a terminal emulator is already a hurdle for them and they don't know about features like bash-completion or similar. This means they are going to try to type colcon build --symlink-install from their memory by staring on on the screen instructions, which goes wrong a lot of times, which frustrates the students (and in the end me ;-)).

With this PR's --symlink-install becoming a functional option for Python based packages, I'd prefer to see that becoming the default case. I would say this would take away ONE of the huge hurdles for beginners with ROS2 development and I guess it should be as user-friendly as possible.

@nuclearsandwich
Copy link
Contributor

At least I am using it much more this way than the other way around and the default CLI command should be the default behavior, doesn't it?

You mentioned above that you might not be aware of all use cases, and I think that's important to bear in mind. The most commonly used options in the education and outreach space are not necessarily the best general defaults. In this specific case, a symlink installation does not produce a relocatable installation space suitable for bundling or distributing, something that colcon is used for both on the ROS 2 CI farm and by other groups building ROS workspaces.

Making the "most popular" or "most common" options the default is one strategy if there's a very clear winner of that category but I think another approach is to make the default the "least surprising" behavior and for a verb like install truly performing the installation is much less surprising than defaulting to a symlink install. Especially for those coming from earlier build tools in the ROS ecosystem.

To reduce the overhead for students, you could recommend a configuration setup using colcon-defaults although at the risk of soapboxing, I think that using non-standard defaults can introduce more confusion in the long term than adding additional command line options to support a specific workflow.

@cottsay
Copy link
Member Author

cottsay commented Feb 3, 2024

Thanks @RodBelaFarin and @nuclearsandwich for your thoughts.

I have three of my own to share:

  1. Even if a Python developer may expect changes to take effect immediately after saving a file, a C++ (or Rust) developer would expect nothing to change until the code is compiled. From this perspective, symlinking can bring unexpected behavior if implicitly enabled. Additionally, strange things can happen if a long-standing workspace changes significantly as would happen when using vcs pull to update a patch release. The native python code would be mostly updated (see [1]), but a build would be required to update the compiled packages.
  2. It would simply be too painful to invert the implicit behavior at this point. As @nuclearsandwich discussed, there are numerous applications of colcon in CI and other contexts where the results are archived for download or distribution. Symlinking fundamentally breaks that use case. A three-phase deprecation cycle would be necessary to throw warnings first at users not using any flag, and then at users using the old flag. It's hard to justify a deprecation warning to users who don't specify any flag at all stating that the way they invoked the tool will change behavior.
  3. All of this is deprecated. The approach taken here has no corresponding change in standards-based Python packaging, which is where we're going next. I expect this code to be moved out of colcon-core in the not-too-distant future. (details here)

Getting this change merged and matured in colcon-core is step zero. Even if there were a clear path forward, I can't see us making any moves until we're absolutely sure that this approach is correct and stable. It took spelunking through setuptools to understand how to make this change to begin with, which is probably why this wasn't implemented long ago.

All this said, I still believe symlink installs have utility and I use them regularly, but I'm also acutely aware of the limitations.

[1] Symlink installs only work as expected for existing files and Python modules. If you add a new top-level Python module or a new data file, you're going to need to build anyway to create the new symlinks in the install tree. Understanding when this is necessary could be just as confusing for new developers in my opinion.

@cottsay
Copy link
Member Author

cottsay commented Feb 3, 2024

Note that we won't see test coverage in the added file symlink_data.py due to #579, which is why codecov is reporting such a reduction. I simply don't see a way around it at this point, and I'd rather focus on getting standards-based python builds online, which won't have the same issue.

@cottsay cottsay merged commit f7dfec7 into master Feb 3, 2024
35 checks passed
@delete-merged-branch delete-merged-branch bot deleted the cottsay/symlink-data branch February 3, 2024 04:35
@cottsay cottsay added this to the 0.15.3 milestone Feb 3, 2024
@cottsay cottsay mentioned this pull request Feb 3, 2024
@tonynajjar
Copy link

Any estimate on when you plan on releasing 0.15.3? It says it's 100% complete 😁

@robotasun
Copy link

May I ask some particulars about the new symlinking behavior?
I typically include some of my necessary configurations, data, etc. in arranged folders and add them to my share space, e.g.

setup(
...
data_files=[
...
        # Include all configuration files (preferably with --symlink-install)
        ('share/' + package_name + '/config', glob('config/*')),
        # Include all data files (preferably with --symlink-install)
        ('share/' + package_name + '/data', glob(os.path.join('data', '*.np[yz]'))),
...
],...
)

Does using --symlink-install extend to all added data_files, or does it have certain limitations to which file types that can be added?
Furthermore, in the example above, let's say that I prefer saving over these *.np[yz] files as a consequence of a node running, intermittently or before exiting by using ament libraries to point at them. Would this carry any risks other than typical ones?
I'll for sure test my own examples myself, but having a few "bad" examples that I didn't perhaps add could be helpful for everyone

@cottsay
Copy link
Member Author

cottsay commented Mar 18, 2024

Does using --symlink-install extend to all added data_files, or does it have certain limitations to which file types that can be added?

All files listed under data_files are now symlinked when you use --symlink-install. The symlinks happen at the file level, so no directories are symlinked and the structure should match the previous non-symlink behavior exactly.

...let's say that I prefer saving over these *.np[yz] files as a consequence of a node running, intermittently or before exiting by using ament libraries to point at them. Would this carry any risks other than typical ones?

I think that modifying files in your install space at runtime is an antipattern even outside the context of symlink installs. When packaged and installed as a deb, those files would be owned by root and wouldn't be writable by a normal user anyway. Just like how we generally treat the sources as read-only during the build, the install space is generally treated as read-only at runtime.

That said, I think it depends on precisely how you're saving over the files. If you open the file read-write at runtime, it will resolve the symlink and any changes you make will be made in the package sources. If you create a new file and replace the file in the install space, the symlink would be replaced with the new file, at least until the package is rebuilt.

@thandal
Copy link

thandal commented Mar 21, 2024

Any estimate on when you plan on releasing 0.15.3? It says it's 100% complete 😁

I just apt upgraded to python3-colcon-core (0.16.0-2), and symlinks are working great! Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment