Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow disabling unsafe types? #483

Closed
rexxars opened this issue Aug 16, 2021 · 6 comments · Fixed by #717
Closed

Allow disabling unsafe types? #483

rexxars opened this issue Aug 16, 2021 · 6 comments · Fixed by #717
Labels
enhancement Add new functionality

Comments

@rexxars
Copy link

rexxars commented Aug 16, 2021

There's a few file types that are marked as "unsafe" - and looking at the detection signatures I assume it's because the check is somewhat simple/naive. I have a gLTF file that is being detected as an ico, for instance.

Would it be possible to either:

a) Be able to disable the unsafe types (fromBuffer(buf, {allowUnsafe: false}) or similar)
b) Return an unsafe: true in the return value for the unsafe types
c) Expose an array or similar that declares the unsafe types so we can exclude them on the consumer side

The issue with c) is that some of the unsafe file types does have more "safe" routes (like mpg).

@Borewit
Copy link
Collaborator

Borewit commented Aug 16, 2021

I did not immediately understood what you meant with unsafe signatures, but I get it now, it are signatures beyond this point:

file-type/core.js

Line 1179 in c037ba7

// -- Unsafe signatures --

these are much more likely to be false positives.

@Borewit Borewit added the enhancement Add new functionality label Aug 16, 2021
@rexxars
Copy link
Author

rexxars commented Aug 17, 2021

I did not immediately understood what you meant with unsafe signatures, but I get it now, it are signatures beyond this point:

file-type/core.js

Line 1179 in c037ba7

// -- Unsafe signatures --

these are much more likely to be false positives.

Yep, apologies for the vagueness. I've updated the original issue with a link.

Would you be willing to accept a PR? Any preferred approach?

@Borewit
Copy link
Collaborator

Borewit commented Aug 17, 2021

I had a look to the MPEG-1 detection based on your feedback.

file-type/core.js

Lines 1181 to 1189 in c037ba7

if (
check([0x0, 0x0, 0x1, 0xBA])
|| check([0x0, 0x0, 0x1, 0xB3])
) {
return {
ext: 'mpg',
mime: 'video/mpeg',
};
}

some of the unsafe file types does have more "safe" routes (like mpg).

I don't think there is an alternative detection for that present in the code.

I actually tlooked into making it more safe. From the non official specs it looks the current filter is actually already to narrow (at least on the bytes it is testing on). It's currently to specific on the stream-id.

a) Be able to disable the unsafe types (fromBuffer(buf, {allowUnsafe: false}) or similar)

I think it makes sense to optionaly exclude likely false positives (the so called unsafe signatures). I would prefere includeUnsafe.

b) Return an unsafe: true in the return value for the unsafe types

Risk on false postives also exist for signatures and for some more likely to occure then others. But it makes sense as well to indicate that the likelyhood of a false postive is significantly higher.

c) Expose an array or similar that declares the unsafe types so we can exclude them on the consumer side

I am not a big fan of exposing potential outcomes as part of the API to begin with.

Additonally:

  • We could try these file type (signature) recognition
  • We could move the unsafe signatures furher down, so give more presedence to safe longer signatures.

@Borewit
Copy link
Collaborator

Borewit commented Aug 20, 2021

Related to unsafe types: Borewit/peek-readable#356 (comment)

pointing out is is also guessing the file-type here:

file-type/core.js

Lines 176 to 180 in c037ba7

// Guess file type based on ID3 header for backward compatibility
return {
ext: 'mp3',
mime: 'audio/mpeg',
};

@Borewit
Copy link
Collaborator

Borewit commented Dec 18, 2024

After #704 we could group the unsafe detections to its own method, which allows te remove them from the detector list. Maybe we should extend the detector interface a bit, so that parsers have something like a name / id, so you manipulate this list of detectors.

fileTypeParser.detectors = [
  {id: 'default', detect: (t) => detectUnsafe(t))},   
  {id: 'default.unsafe', detect: (t) => detectUnsafe(t)}  
] 

You could then get rid of the detectors:

fileTypeParser.detectors = fileTypeParser.detectors.filter(detector => detector.is !== 'default.unsafe');

@sindresorhus
Copy link
Owner

👍

We should then make id mandatory also for third-party detectors, so it can be relied on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Add new functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants