From a4448c537989bc029a057cd7b9aeb5c7fc4a23c4 Mon Sep 17 00:00:00 2001
From: per1234
Date: Sat, 2 Dec 2023 09:20:24 -0800
Subject: [PATCH] Fix macOS/BSD incompatibility in `general:check-filenames` task

The "Check Files" (Task) template includes an asset task named
`general:check-filenames` that checks for the presence of non-portable
filenames in the project. Ironically, the task itself was non-portable.

The problem was that it used the `--perl-regexp` flag in the `grep` command.
This flag is not supported by the BSD version of grep used on macOS and BSD
machines. This caused the task to fail spuriously with
`grep: unrecognized option '--perl-regexp'` errors when run on a macOS or BSD
machine.

The incompatibility is resolved by changing the `--perl-regexp` flag to
`--extended-regexp`. This flag, which is supported by both the BSD and GNU
versions of grep, allows the use of the modern and reasonably capable POSIX ERE
syntax on all platforms.

Unfortunately, the regular expression used in the previous command relied on
one of the additional features only present in the PCRE syntax. This syntax was
used to check for the presence of a range of characters prohibited by the
Windows filename specification:

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions

> Use any character [...] except for the following:
> - Integer value zero, sometimes referred to as the ASCII NUL character.
> - Characters whose integer representations are in the range from 1 through 31

Because these are non-printable control characters, they must be represented
by their character codes in the regular expression. This was done using the
`\x{hhh..}` syntax supported by PCRE. Neither that syntax nor any of the
equivalent escape patterns are supported by POSIX ERE.

A solution is offered in the GNU grep documentation:

https://www.gnu.org/software/grep/manual/grep.html#Matching-Non_002dASCII-and-Non_002dprintable-Characters

> the command `grep "$(printf '\316\233\t\317\211\n')"` is a portable albeit
> hard-to-read alternative

As also mentioned there:

> none of these techniques will let you put a null character directly into a
> command-line pattern

So the range of characters in the pattern cannot include NUL. However, it
turns out that even the previous command did not detect this character,
although it was covered by the pattern. So this limitation doesn't result in
any regression in practice.
---
 Taskfile.yml                                            | 4 ++--
 workflow-templates/assets/check-files-task/Taskfile.yml | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Taskfile.yml b/Taskfile.yml
index 9032c1ec..4e6cbf94 100644
--- a/Taskfile.yml
+++ b/Taskfile.yml
@@ -429,8 +429,8 @@ tasks:
 ' \
 basename "$0" | \
 grep \
- --perl-regexp \
- --regexp='"'"'([<>:"/\\|?*\x{0000}-\x{001F}])|(.+\.$)'"'"' \
+ --extended-regexp \
+ --regexp='"'"'([<>:"/\\|?*'"'"'"$(printf "\001-\037")"'"'"'])|(.+\.$)'"'"' \
 --silent \
 && \
 echo "$0"
diff --git a/workflow-templates/assets/check-files-task/Taskfile.yml b/workflow-templates/assets/check-files-task/Taskfile.yml
index bd3e68f0..6540de20 100644
--- a/workflow-templates/assets/check-files-task/Taskfile.yml
+++ b/workflow-templates/assets/check-files-task/Taskfile.yml
@@ -18,8 +18,8 @@ tasks:
 ' \
 basename "$0" | \
 grep \
- --perl-regexp \
- --regexp='"'"'([<>:"/\\|?*\x{0000}-\x{001F}])|(.+\.$)'"'"' \
+ --extended-regexp \
+ --regexp='"'"'([<>:"/\\|?*'"'"'"$(printf "\001-\037")"'"'"'])|(.+\.$)'"'"' \
 --silent \
 && \
 echo "$0"
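
The pattern from the hunks above is hard to read through the nested Taskfile quoting, so here is a minimal standalone sketch of the same technique. The function name `is_nonportable_filename` is hypothetical and not part of the patch. The short options `-E` and `-q` are the POSIX spellings of `--extended-regexp` and `--silent`. `LC_ALL=C` is an added assumption to keep the byte range well defined across locales, which the patch itself does not set.

```shell
#!/bin/sh

# Hypothetical helper, not part of the patch: succeeds (exit status 0) when
# the basename of the given path contains a character prohibited by Windows
# or ends with a dot.
is_nonportable_filename() {
  # printf turns the octal escapes into the raw bytes 0x01 and 0x1F, so the
  # bracket expression receives a literal range. NUL cannot be included.
  control_range="$(printf '\001-\037')"

  # Same ERE as in the patch, assembled from single-quoted pieces around the
  # expanded range.
  pattern='([<>:"/\\|?*'"${control_range}"'])|(.+\.$)'

  # LC_ALL=C is an assumption added for this sketch.
  basename "$1" | LC_ALL=C grep -E -q "$pattern"
}

# Usage: report only the names the check rejects.
for name in 'README.md' 'notes?.txt' 'trailing.'; do
  if is_nonportable_filename "$name"; then
    echo "non-portable: $name"
  fi
done
```

With these sample names, the loop reports `notes?.txt` and `trailing.` and stays silent for `README.md`.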