Incomplete ellipses transforms #9

Closed
4 tasks done
teddybradford opened this issue Mar 25, 2023 · 10 comments
Labels
💪 phase/solved Post is done

Comments

@teddybradford
Contributor

Initial checklist

Affected packages and versions

[email protected]

Link to runnable example

No response

Steps to reproduce

I think the ellipses transforms are incomplete.

  1. A period after an ellipsis seems to be treated as part of the ellipsis.
  2. Spaced periods don't always get parsed if there's no space around the group.

Here are some examples of what happens currently:

foo.... bar -> foo… bar
foo. . .bar -> foo. . .bar
foo. . .. bar -> foo… bar
foo. . . . bar -> foo… . bar
foo . . .bar -> foo . . .bar
foo. . . .bar -> foo… .bar

Expected behavior

I would expect those examples to return these values instead:

foo.... bar -> foo…. bar
foo. . .bar -> foo…bar
foo. . .. bar -> foo…. bar
foo. . . . bar -> foo…. bar
foo . . .bar -> foo …bar
foo. . . .bar -> foo. …bar
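
For illustration, here is a minimal string-level sketch that reproduces the six expected results above. `expectedEllipses` and `dotRun` are hypothetical names, not part of retext-smartypants, which operates on a parsed syntax tree rather than on raw strings. The rule for four dots (the first dot is the period when the run is glued to the next word, otherwise the last dot is) is an assumption inferred only from these examples, and runs of five or more dots are out of scope.

```typescript
// Hypothetical helper, not part of retext-smartypants.
// Matches a run of three or four dots, each optionally separated by one space.
const dotRun = /\.(?: ?\.){2,3}/g

function expectedEllipses(text: string): string {
  return text.replace(dotRun, (match: string, offset: number) => {
    // Count the dots in the matched run, ignoring spaces.
    const dots = match.split('.').length - 1
    if (dots === 3) return '…'
    // Four dots: decide which end of the run carries the sentence period.
    const next = text.charAt(offset + match.length)
    // Glued to the following word: treat the first dot as the period.
    // Followed by whitespace or end of text: treat the last dot as the period.
    return /\S/.test(next) ? '. …' : '….'
  })
}

console.log(expectedEllipses('foo. . . . bar'))
console.log(expectedEllipses('foo. . . .bar'))
```

This only shows that the expected table is internally consistent; it says nothing about how such a rule would interact with sentence tokenization.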

Affected runtime and version

[email protected]

Affected package manager and version

No response

Affected OS and version

No response

Build and bundle tools

No response

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Mar 25, 2023
@wooorm
Member

wooorm commented Mar 25, 2023

a) why would more than 3 dots not turn into ellipses? (1, 3, 4, 6)
b) I’m not sure what 2 and 5 are supposed to be. I don’t think I’ve seen people write such characters in English, or in other languages that I am aware of, flush in the middle of words or sticking to the next word. retext is about natural language, not programming code: it cuts text up into sentences, and I don’t understand how sentences work here; retext probably doesn’t either.

@teddybradford
Contributor Author

teddybradford commented Mar 25, 2023

a) Some style guides use a four-dot ellipsis (ellipsis + period) when crossing sentences:

The MLA now indicates that a three-dot, spaced ellipsis . . . should be used for removing material from within one sentence within a quote. When crossing sentences (when the omitted text contains a period, so that omitting the end of a sentence counts), a four-dot, spaced (except for before the first dot) ellipsis . . . . should be used.

https://en.wikipedia.org/wiki/Ellipsis#American_English

b) These cases can probably be ignored then. But as far as I can tell, many style guides don't explicitly say that the spaced ellipses must be surrounded by spaces (i.e., foo . . . bar vs. foo. . .bar)

@wooorm
Member

wooorm commented Mar 30, 2023

Thanks for the link!

To me, your argument for a) explains why some people use 4 dots, not why it should turn into …., as I don’t see an example of that output.
I do see one case of that, in the French example.

I think what’s complex about trying to follow these style guides, is that they’re all different, they each have different rules, and then also for authors and for “typesetters”.

But this project doesn’t follow one specific style guide. And if we did, we’d break with the rest, right?

The reason four or more periods are turned into an ellipsis is that some humans don’t stop at 3. For example: Wait..... what’s wrong with that?

@teddybradford
Contributor Author

teddybradford commented Apr 11, 2023

The more I read and think about it, the more complex I realize this is. It's difficult (impossible?) to parse for all these cases, and intuit, with accuracy, what the intended behavior should be—especially if a text uses varying types of ellipses.

With that in mind, would you consider adding an option to this package that toggles converting triple-spaced dots (but keeps consecutive-dot conversions for ellipses)? This would give more flexibility, making it easier to apply custom, context-specific regexp replacements after running text through this plugin.

@wooorm
Member

wooorm commented Apr 12, 2023

the more complex I realize this is

Yep, same.

would you consider adding an option to this package that toggles converting triple-spaced dots

Yes, I am open to such a feature.
I am not interested in writing it myself though: are you?
I’d also wonder how it would look exactly, so that can be some back-and-forth to figure out!

@teddybradford
Contributor Author

Maybe something like updating options.ellipses to work like options.dashes:

Create smart ellipses (boolean or 'unspaced', 'spaced', default: true).

Converts triple dot characters (with or without spaces) into a single unicode ellipsis character.

@wooorm
Member

wooorm commented Apr 20, 2023

Maybe! Probably good, but might depend a bit on how the feature you come up with actually works!
Previously we also discussed triple vs more-than-triple, was that also planned in this PR/option?

@teddybradford
Contributor Author

I was thinking of keeping it simple and only adding options for enabling/disabling formatting of spaced vs. unspaced ellipses (keeping the ellipses length regex as-is).
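
A rough sketch of how such an option could behave on plain strings, assuming the length rule stays at "three or more dots". `EllipsesOption`, `smartEllipses`, and both regexes are hypothetical and only illustrate the proposed `true` / `'spaced'` / `'unspaced'` semantics; they are not the plugin's source, which works on a syntax tree.

```typescript
// Hypothetical option type mirroring the proposal above.
type EllipsesOption = boolean | 'spaced' | 'unspaced'

// Assumption: three or more consecutive dots count as an ellipsis.
const unspacedDots = /\.{3,}/g
// Assumption: three or more dots separated by single spaces count as an ellipsis.
const spacedDots = /\.(?: \.){2,}/g

function smartEllipses(text: string, ellipses: EllipsesOption = true): string {
  if (ellipses === false) return text
  let result = text
  // `true` enables both forms; a string enables only the named form.
  if (ellipses === true || ellipses === 'spaced') {
    result = result.replace(spacedDots, '…')
  }
  if (ellipses === true || ellipses === 'unspaced') {
    result = result.replace(unspacedDots, '…')
  }
  return result
}

console.log(smartEllipses('Wait..... what', 'unspaced'))
console.log(smartEllipses('foo . . . bar', 'unspaced'))
```

With `'unspaced'`, spaced dots pass through untouched, which leaves room for custom, context-specific replacements afterwards.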

@wooorm
Member

wooorm commented Jan 13, 2024

released! https://github.com/retextjs/retext-smartypants/releases/tag/6.1.0

@wooorm wooorm added the 💪 phase/solved Post is done label Jan 13, 2024
@github-actions github-actions bot removed the 🤞 phase/open Post is being triaged manually label Jan 13, 2024

2 participants