Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Path::glob() function #130

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from
Draft

Add Path::glob() function #130

wants to merge 7 commits into from

Conversation

mosra
Copy link
Owner

@mosra mosra commented Apr 2, 2022

[11:10] "hmm, let's stop procrastinating and add the Path::glob(), finally, can't be that hard"
[21:25] WHERE IS MY FLAMETHROWER

When I hit a bug in musl, which is used by Emscripten, I thought "okay fine, one minor wart, easy to workaround". When I hit another bug in an older version of musl, used by older Emscripten, which was impossible to workaround, I said "fine, let's just XFAIL this case". When I subsequently discovered that Windows conflates the error for "directory doesn't exist" and "no matching files found in said directory" I was a bit shaken already, but still managed to invent a workaround that isn't too complex.

But when I saw that the Windows CI is getting stuck on an ASSERT DIALOG BOX THAT I HAVE TO CLICK because I changed PluginManager to use glob() instead of list() + if(!filename.hasSuffix(ext)) continue;, and the error was that it matched a *.modconf instead of a *.mod file (and gotta say, I was lucky to have that test case uncover this GIANT problem), I started getting furious. Then, I discovered this amazing answer, and thought "okay, silly defaults and over-the-top backwards compatibility, I can understand".

Implemented, pushed ...and nothing changed. The reduced repro case matching *.txt files still matched also *.txtlol. Ultimately, what made me snap, was this post from Raymond Chen. YES, AMAZING JOB EXPLAINING HOW YOU "CAN'T" OR "IMPOSSIBLE" OR "OUT OF QUESTION". IT'S BEEN 17 YEARS SINCE THAT POST, WHY DOES IT TAKE SO DAMN LONG TO ADD ONE MORE FLAG TO FIX THIS?! The FindFirstFileEx() has so many possible extension points it makes me sweat, and yet there's still this damn issue where *.xls matches also *.xlsx, *.dat also *.data, THE WHOLE WORLD SUFFERS and you do nothing!!

So, since this is impossible, other options:

  • Revert the useless FindFirstFileEx() call because it does not add any value
  • "Never use any extension that would have the first three letters same as any other". NO! Sounds like a solution Microsoft would do (except they didn't, with *.docx, *.xlsx, ...). Am I Microsoft? No I'm not!
  • Give up on this cursed career, open a brewery, and get gradually pissed at that as well
  • Implement the globbing myself. Ugh. On Unix I could use fnmatch(), on Windows there seems to be ... PathMatchSpecEx that I could use as a post-glob filter? "Nice?" I even forgot such word existed.
    • Needs to be done on the wide string, of course, ugh, so I need to first split the basename part and then use that to match the filename. Ugh. Doing the UTF-16 expansion crap twice. Ugh.
    • This whole ordeal still has to be documented and preserved for future generations. To see what we wasted our lives on instead of trying to prevent the environmental apocalypse.
  • Due to the excessive workarounds, it no longer makes sense to implement list() using glob() on Windows, revert that
  • Yes and of course there are OTHER new bugs in PluginManager, apart from this assert. What's so strange about Bulldog.dll that a *.dll query cannot find it?! Figure out and fix. I hope it's something silly and not the damn glob not returning files it should besides returning files it shouldn't, because then it would be really crappy.
  • Also may want to check behavior with symlinked directories and parents, i.e. path/to/symlinked-dir/../*.txt may look somewhere entirely different than path/to/
    • For which it might be good to introduce Path::normalize() and/or Path::realPath()

mosra added 7 commits April 2, 2022 21:15
I was procrastinating this for way too long already. But as expected, it
was an immense load of pain.
Sigh. Not one day passes where I would just carry on coding against APIs
that are neither crap nor buggy.
Unfortunate. I don't want the application to complain give me the same
error when a directory doesn't contain any *.txt files or when the
directory doesn't exist at all.

The Path::glob() tests now fully pass.
…mes.

Wonderful, eh? The file.txtlol becomes something like FILE~1.TXT, with
only three letters from the extension used and the rest ignored.
Discovered completely by accident while trying to port PluginManager
plugin discovery to Path::glob() and it matched *.modconf files where it
should be matching just *.mod.
If someone told me, 10 hours ago, that I would run into problems like
this, I would just say fuck it and not even attempt to implement this
function. Well, here we are.

TODO: AND OF COURSE THIS CHANGE DOESN'T HAVE ANY EFFECT! FFS!!
TODO: so i guess here i'd need to implement this BY HAND?! Even though
   I'm forced to pass * to the damn thing to list stuff in the first
   place?! God fucking dammit.
TODO: https://devblogs.microsoft.com/oldnewthing/20050720-16/?p=34883
   HEY RAYMOND INSTEAD OF SAYING "CAN'T DO" OR "IMPOSSIBLE", WHY
   COULDN'T YOU JUST ADD ONE MORE FLAG TO THE EX FUNCTION?? IT'S BEEN 17
   YEARS SINCE THAT POST!!!
Halves the number of string allocations as we're skipping all *.conf
files this way.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

1 participant