Source position information supported? #36
I haven’t seen anyone ever generating source maps for a compile-to-HTML language, and I don’t believe browsers support source maps on HTML. Your question is very short. Perhaps you can spend some time framing it in more detail? https://github.com/wooorm/markdown-rs/blob/main/.github/support.md |
Sorry, my fault, I really wasn't very clear. I really mean source position information, not source maps. I'm looking for source position information, similar to what is provided in https://github.com/commonmark/commonmark.js, which outputs a I believe your Been using I'm also hoping to find some level of extensibility, hopefully at the markdown grammar level as well as the AST level. Edit: Another thing driving this is having a relatively common parser in Rust for backend and one in JS for the frontend. |
Yes, an AST is supported here too, and the AST contains positional info. See the docs for more info! https://docs.rs/markdown/1.0.0-alpha.5/markdown/fn.to_mdast.html
No Rust parser provides this to my knowledge. I don’t think it can be achieved in Rust.
See the other open issues/PRs, and the issue tracker on https://github.com/wooorm/mdxjs-rs, for more on this! |
Ah ok, I see it in the AST now. But I still would need HTML output. It looks like
Have you taken a look at https://github.com/rlidwka/markdown-it.rs? It seems to have been able to achieve it, or at least it looks that way. Or are we talking about two different things? |
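For readers following along, one way to get HTML that carries source positions is to render it from a position-carrying tree yourself. The following is only a minimal sketch: `Node`, `Position`, and `to_html` are hypothetical names made up for illustration (they are not the `markdown` crate's mdast types), and the `data-sourcepos="line:col-line:col"` attribute shape is assumed from what commonmark.js emits when its `sourcepos` option is enabled.

```rust
/// Hypothetical position type: (line, column) pairs, 1-indexed.
pub struct Position {
    pub start: (usize, usize),
    pub end: (usize, usize),
}

/// Hypothetical, heavily reduced syntax tree; a real mdast has many more node kinds.
pub enum Node {
    Paragraph { position: Position, children: Vec<Node> },
    Text { value: String },
}

/// Renders a node to HTML, copying the position of block nodes into an attribute.
pub fn to_html(node: &Node) -> String {
    match node {
        Node::Paragraph { position, children } => {
            let inner: String = children.iter().map(to_html).collect();
            format!(
                "<p data-sourcepos=\"{}:{}-{}:{}\">{}</p>",
                position.start.0, position.start.1, position.end.0, position.end.1, inner
            )
        }
        Node::Text { value } => escape(value),
    }
}

/// Escapes the characters that are unsafe in HTML text content.
fn escape(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}
```

A paragraph spanning `1:1` to `1:6` that contains the text `a < b` would come out as `<p data-sourcepos="1:1-1:6">a &lt; b</p>`. With a real AST, the same walk would read the positions the parser already recorded.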
Maybe, I’d like to see more examples, e.g., math, frontmatter, directives. The extensions they show as examples, are each better done on the AST I believe. It will still likely be impossible to support here: this is based on enums to switch between states. Enums cannot be extended from outside of a project in Rust AFAIK. |
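To make the enum point concrete, here is a rough sketch of an enum-driven state machine. The state names and functions are hypothetical and far simpler than anything in markdown-rs; the only point is that every state is a variant of one closed `enum`, and every transition lives in one `match`.

```rust
/// Hypothetical parser states; not the real markdown-rs state names.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Text,
    Emphasis,
    Done,
}

/// One step of the machine. All states are handled in this single `match`,
/// so a new construct needs a new variant *and* new arms here. A downstream
/// crate can do neither, because Rust enums are closed to outside extension.
pub fn step(state: State, byte: Option<u8>) -> State {
    match (state, byte) {
        (_, None) => State::Done,
        (State::Text, Some(b'*')) => State::Emphasis,
        (State::Emphasis, Some(b'*')) => State::Text,
        (s, Some(_)) => s,
    }
}

/// Runs the machine over the input and counts the bytes seen inside emphasis.
pub fn count_emphasized(input: &str) -> usize {
    let mut state = State::Text;
    let mut count = 0;
    for &b in input.as_bytes() {
        let next = step(state, Some(b));
        // Count only bytes that keep us inside emphasis (not the markers).
        if state == State::Emphasis && next == State::Emphasis {
            count += 1;
        }
        state = next;
    }
    count
}
```

An extensible design would have to swap the enum for something open, such as trait objects or registered callbacks, which is a different architecture rather than a small change.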
One example of something that would be difficult to implement on the AST would be supporting multiline blockquotes. They should behave similarly to a code block in terms of syntax, but instead of wrapping as a code block, the content would be wrapped in a I don't see a way that this could be implemented properly on an existing AST. We currently implement it as pre-processing step on the raw markdown, but it's not at all ideal. |
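As an illustration of why the pre-processing approach is fragile, here is a minimal sketch of such a step. It assumes the `>>>` fence that GitLab Flavored Markdown uses for multiline blockquotes; the function name is hypothetical and this is not GitLab's actual implementation.

```rust
/// Rewrites `>>>`-fenced blocks into ordinary `>`-prefixed blockquote lines.
/// This works on raw text only and knows nothing about the surrounding markdown.
pub fn expand_multiline_blockquotes(src: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut in_quote = false;
    for line in src.lines() {
        // A line consisting only of `>>>` opens or closes a quote block.
        if line.trim_end() == ">>>" {
            in_quote = !in_quote;
            continue;
        }
        if in_quote {
            if line.is_empty() {
                out.push(">".to_string());
            } else {
                out.push(format!("> {}", line));
            }
        } else {
            out.push(line.to_string());
        }
    }
    out.join("\n")
}
```

The happy path works: `"a\n>>>\nb\n\nc\n>>>\nd"` becomes `"a\n> b\n>\n> c\nd"`. But because the function has no parsing context, a `>>>` line inside a fenced code block is also treated as a quote fence, which corrupts the code block. That is the kind of error a parser-level extension would avoid.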
True, that’s not really possible on an AST. I am personally pretty strongly against adding more syntax extensions to markdown. I think it makes markdown less portable. I wrote a bunch about it here: https://github.com/micromark/micromark#extending-markdown. |
In general I agree with you. I prefer not adding to the markdown syntax unless it can be portable. However multiline blockquotes in GitLab have been around awhile (and are actually very useful). We pre-process the raw markdown, which is prone to errors. Being able to have the parser handle it in the proper context is useful. But I do think a couple extensions should be added to markdown, such as definition lists, math (which you already have implemented), emojis, and an attribute syntax so that image sizes can be specified. I just wish CommonMark would move forward on some common extension syntax. 😢 I like some of the work that @jgm has done in https://github.com/jgm/commonmark-hs/tree/master/commonmark-extensions/src/Commonmark/Extensions. Since he is the author of CommonMark, I feel a little more comfortable following those extensions. For example our image sizing follows along with his attribute syntax I don't know a lot about MDX, but it seems to be geared toward JavaScript, and basically looks like XML/HTML. In that case, it seems like just using HTML would be better - I'm not sure how MDX would make HTML definition lists any better/less complicated. And I admit I don't know anything about directives yet. Edit: I think directives would certainly solve some markdown extension problems. There is still an issue regarding portability, since similar behaving directives could be named differently. But it would provide well-behaved hooks for custom rendering. But I'm not sure it would help with say definition lists. |
That’s why I try to push people towards directives. They are one syntax that solves all other syntax extensions. And MDX is an alternative for that, useful for programmers.
That too could be done with directives.
You probably guessed, I think the main thing is to add directives, then there might be some small improvements, but otherwise I think the stretch is out of markdown, and it isn’t going to see new syntax extensions ever. Like, can you imagine HTML being extended with a new syntax? Suddenly having a
Aside, but that’s also a problem. I don’t believe a single person should impact a language used by zillions like that. I’d prefer CommonMark being more of a committee, with more formalized governance. And not having a (very smart) single person typing up some things that aren’t specced and having it become “de facto”.
Yes
It does for literate programming cases. And MDX adds JavaScript that can be evaluated.
They solve all your needs! 😅
True! |
I took a look at https://github.com/commonmark/commonmark-spec/wiki/Generic-Directive-Extension-List, and I kept wanting each proposal to have an actual example, to make it concrete. I think it opens up the possibility of adding certain custom rendering, but it's not effective for portability unless there is a standard agreement for how a specific directive works. Take the YouTube example from remark-directive, Without consensus, it's entirely possible and likely that someone will code their implementation to use I also don't see how it would address a definition list syntax. I suppose you could have a block that starts with On the flip side, there is already a format for definition lists that has been widely used. I use them all the time, and I think it's a relatively elegant solution to that particular problem. I do agree that it's not feasible to keep adding new syntax ad infinitum. But HTML does in fact add new syntax. For example Having something well defined in the CommonMark spec, or a CommonMark extension spec, would actually bring wider adoption and portability. GFM is a great example.
The problem is I'm already there. I have quite a few requests for definition list support. Same way I had a huge number of requests to provide a way to size markdown specified images. The community hasn't made much progress in the last decade on agreeing to a syntax for these extensions. I've been waiting, hoping. At some point, one has to decide to move forward anyway, picking the best, most commonly adopted syntax. That's why I tend to look to https://github.com/jgm/commonmark-hs/tree/master/commonmark-extensions. I know that most things he puts together are built with thought and portability in mind. Whether it ends up being "the final" spec for something, who knows.
I would agree with that, as long as they had a mandate to actually make decisions. Take everyone's input on proposed extensions, but finally take a decision. So while I think directives are interesting and useful, and in general I support them, I don't think they are a panacea. I think having the ability to add extensions to the parser when other options don't make sense, is important. And when you have a large legacy of markdown data, as we do, it's important to be able to continue to support it. Which is what extensions would enable us to do. |
Hi again! :)
You may have seen this but an example of how to use them is in https://github.com/micromark/micromark-extension-directive. A bit down there’s a description of the syntax.
That’s why I want this in a spec or embraced by GH!
I fail to see how some new syntax wouldn’t have the same problems as directives that you describe?
Agreed. You need consensus. With any syntax
It has some problems:
Strong disagree: that isn’t a new syntax, it’s new semantics. Importantly, Perhaps of interest: some similar discussion is here: https://github.com/orgs/community/discussions/16925#discussioncomment-2791869 :) |
Hey
Ok, the difference between syntax and semantics. So we agree that you need consensus on the syntax - the actual syntax of how a directive is written. But the HTML spec is also built on a consensus of semantics, such as Same with directives. Even with an agreed upon syntax, you would need some consensus on the semantics - is it
I can't make any promises, but it's something I would consider looking at adding to GitLab.
Well, it's an extension for a reason. I'm not advocating adding it to CommonMark core. And there wasn't a consensus on markdown until CommonMark arrived. But there are many people using a specific syntax, and by settling on that, you can serve a lot of people, and drive wider adoption. Most implementations I've seen for definition lists use that syntax. For example wataru-chocola/remark-definition-list I would just like the option to be there, for an organization that wants to use And I have yet to see a better syntax for it. I have no clue how directives would even approach this, without driving the document author crazy. |
Yep. Might be useful to bake that in from the start! Without it, we at least have the syntax. And everything is “custom”. Better than before in my opinion, but not super portable.
I didn’t know your “we” was GitLab. Interesting! Yes, please do! :)
A particular problem exists around definition lists: it’s basically an alternative for writing HTML
The complexity here for me, while I understand it’s useful to you, is how to best serve the markdown world? Fewer extensions are, in my head, better. Some extensions (e.g., math) are okay. Tough to weigh! |
I understand what you're saying, and I don't think there is anything wrong with having the crate be opinionated. I would add your directive functionality as a part of this crate, controlled via a switch as you do the math support. Assuming you feel the spec of it is complete enough. It would be best if the syntax could be accepted as a core of CommonMark, so that there would be a defined fallback if a parser didn't support a particular directive. Even showing as a code block would be sufficient. As it stands it would just be run-on text. But you'll have to win that battle on the CommonMark forum. But in my own opinion, serving the CommonMark community is also supporting the ability for devs to extend via their own extensions. If someone is writing something green, brand new, maybe they have the luxury of not needing any extensions. But if they need to support any legacy data, that may use extensions (maybe coming from the remark/micromark ecosystem), then I don't think cutting those off is the best. At GitLab we've been very limited in the extensibility of our current parser. This hampers us in being able to support not only features that customers want, but in performance and correctness. Here's an example. We need to be able to know when a character has been escaped. This allows us to short circuit certain handling, such as user mentions. This is very difficult to do without access to the parser, requiring a pre-processing step, and a post-processing step. And even then I think it's missing a couple corner cases. This type of work is much better suited for the parser/renderer. Heck, ideally, we'd build an extension specifically for user mentions (and our other special syntax, not unlike GH's Anyway, at least for my case, an extension system is important. And I would venture that it would be important for a lot in the community as well. |
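To illustrate the escaping problem described above, here is a minimal sketch of escape-aware mention detection on raw text. The function name and the username rule (ASCII letters, digits, underscore) are assumptions made up for this example; it treats a backslash as an escape only before ASCII punctuation, which is how CommonMark defines backslash escapes.

```rust
/// Returns the usernames of `@mentions` whose `@` is not backslash-escaped.
/// Hypothetical helper: it ignores code spans, links, and every other construct
/// a real parser would know about.
pub fn find_mentions(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut mentions = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        match chars[i] {
            // A backslash before ASCII punctuation escapes it: skip both characters.
            '\\' if i + 1 < chars.len() && chars[i + 1].is_ascii_punctuation() => i += 2,
            '@' => {
                let start = i + 1;
                let mut end = start;
                while end < chars.len()
                    && (chars[end].is_ascii_alphanumeric() || chars[end] == '_')
                {
                    end += 1;
                }
                if end > start {
                    mentions.push(chars[start..end].iter().collect());
                }
                // `end` is always at least `i + 1`, so this always advances.
                i = end;
            }
            _ => i += 1,
        }
    }
    mentions
}
```

For `hi @alice and \@bob` this yields only `alice`, and for `\\@carol` (an escaped backslash followed by a real mention) it yields `carol`. Even so, the sketch still gets code spans and similar contexts wrong, which is why this kind of check is easier to do correctly inside the parser or on a tree that records escapes.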
A bit snarky, but the user is often wrong (and also often right, at the same time).
GH implements references (to users, to issues, to commits, to CVEs) on an HTML AST. It doesn’t know about escapes either:
I understand this! I don’t think you’re wrong. I think there are trade-offs. I think it’s better for markdown to not add a lot of syntax extensions. I think it’s better for vendors to not add custom syntax extensions that don’t work in other places. |
Yeah but when they're right, they are right. And it's then incumbent on me, as a provider, to make things work as best they can to solve their problems. For example the escaping issue. They are absolutely right - when you write
I also posted on the cmark forum a couple years ago about it: commonmark/cmark#366 I'm incredibly disappointed that it was impractical to fix this any other way than we did. It's a real hack. But in most cases, it solves a customer problem and annoyance. One less of a thousand cuts. If the library supported extensions, after failing to get the library authors to add the capability, I could have added it myself.
I'm very reluctant to add any new syntax or AST transformations, which tend to be just as unportable. I spend a lot of time looking at alternatives, most commonly accepted solutions, as well as pushing back. And there are times, and customer requirements, that require a solution. Plain and simple. Adding something via the AST is useful in many cases, and a hack in others. If something needs to get added, it can many times be made more CommonMark compliant by having the option of adding it at the proper place in the parsing chain. And many AST transformations are indeed adding syntax which is not portable. And remember, some syntax is not meant to be portable. Some features, such as @ user mentioning or issue referencing, make no sense in other contexts. In any case, I'm not sure I've moved the needle at all in this discussion, which is fine. I do think it's a bummer that you provide the ability to have a rich ecosystem for remark/micromark, and that it won't carry to the Rust version. I know for us, based on our requirements and experience thus far, using a system that doesn't provide us with that capability is a tough sell.
I think it's better for markdown to have the CommonMark community/writers push forward on finally solving some of the many discussions around extensions, various proposed syntaxes, etc. I will work hard to have our implementation fall in line with any real consensus. Until then, features will continue to be added by implementors, such as the proposed note syntax, that don't really line up well with CommonMark. 🤷 |
I argue they are typically right about the problem. Not right about what they propose as a solution.
There are significant benefits to traversing syntax trees for several features as opposed to plugging into the parser. (Not always: the math extension supported by GitHub is terrible!). Especially in tools that support a subset of HTML. It makes, for example Note, we already have character escapes. What you might want in this case, is a CST. We expose all this info (it’s not obvious and pretty yet):
I’ve kept this somewhat hidden until people need it. With those needs, we can design good APIs.
I argue that the rich ecosystem is due to syntax trees, which we have some of already, and plugins, which I want to add here too.
I think forks might be quite fine for the needs of GitLab. That’s what GitHub does too with No parser that I am aware of outside of markdown supports syntax extensions. Babel doesn’t support this. Nobody extends HTML with new syntax.
I’m happy to discuss this. I discuss it with many people. For years. I don’t always hold the same opinion as other times. So yes, the needle moves. But not too much haha! |
Wow, no, totally disagree. Yeah when that's your only option, sure that's what someone has to do. But making someone write "Firehouse |
Are source maps supported, and if so to what degree? Blocks only, or full (including embedded HTML)?