From a2afb03dc831c05fb631ffbe46d357b17f4a2a10 Mon Sep 17 00:00:00 2001 From: John Haugabook Date: Sun, 3 Sep 2023 21:01:48 -0400 Subject: [PATCH] roffit: fix special characters and broken links (#9) * fix special characters and broken links: testpage.1 Changed testpage.1 back to source. * fix special characters and broken links: testpage.output Reverted testpage.output back to source. * fix special characters and broken links: roffit Fixed one tabbed indent, and tried to implement corrections in "text2name()" and "do_encode()" again. Results - this is the best I'm going to do. Was not able to get working in "do_encode()" at all. In "text2name()" had very minor unexpected results in one file "brew.1", and was not able to resolve. It reproduced multiple links of the same link on several lines in that file. --- roffit | 2 +- testpage.1 | 35 ++--------------------------------- testpage.output | 17 +++-------------- 3 files changed, 6 insertions(+), 48 deletions(-) diff --git a/roffit b/roffit index db9f51a..fb89841 100755 --- a/roffit +++ b/roffit @@ -329,7 +329,7 @@ sub linkfile { $field =~ /(^|\W)((https|http|ftp):\/\/[a-z0-9\-._~%:\/?\#\[\]\@!\$&'()*+,;=]+)/ || $field =~ /(^|\W)RFC ?(\d+)/ ) { - convert_html_links(); # Run link conversion in subroutine to save time. + convert_html_links(); # Run link conversion in subroutine to save time. } if ( # Fix special characters in subroutine to save time. $field =~ /(<\/a>)(-)([$specialcharacters]|[\[\]])((,)|(<\/span>))/ || diff --git a/testpage.1 b/testpage.1 index eedb30e..0d6ed81 100644 --- a/testpage.1 +++ b/testpage.1 @@ -19,47 +19,18 @@ but ARFC 959 is a fake RFC3986 is URI syntax -Full inline urls with http, https, or ftp protocols are converted to -html links. For example: - -https://curl.se/, ("https://curl.se/docs"), (https://daniel.haxx.se/projects/roffit/). - -But when the url is in an example it is not converted into an html link.. 
- -.nf -  cmd -opt https://example.com -.fi - -When the protocols are declared within punctuations, but have no domain they remain -as an <a> element with the href attribute removed. For example: - -("http://"), (https://), and "ftp://"; are <a> elements, while unpunctuated -http:// or https:// or ftp:// are inline text within <p> element. - -Additionally special character options will have anchor links. -So if an option similar to the one below is documented; -it will have an anchor and anchor link: - -.IP "-?, --special-char" -Options with special characters will be included in anchors. -Such as this option with special character \fI\-?, \-\-special\-char\fP. - Also, we must support \fIstyle staring on one line and ending on another that may\fP be multiple lines off. - .SH OPTIONS .IP "--bare" -The output HTML will not include any HTML, HEAD or BODY tags. Also note that +The output HTML will not include any HTML, HEAD or BODY tags. Also not that when this is selected, there will be no inlined CSS but you will have to define the necessary classes yourself. - .IP "--mumbo" Display version number and exit. Also see \fB--jumbo\fP. Word in bold. - .IP "--jumbo" Cool option. See \fI--mumbo\fP. Word in italic. - .IP \-\-mandir=

Set a directory in which \fIroffit\fP will check for other man pages (in nroff [name].[num] format) that this one refers to. If found, a link will @@ -70,11 +41,9 @@ name in the generated link will be prefixed by the dir given with This works for references specified as \fImanpage(3)\fP (within the emhpasis foformatting) and in a plain \.BR section (often used in the SEE ALSO section). - .IP \-\-hrefdir= Specify a directory to prefix generated href links created with the \-\-mandir option. This defaults to ".". - .SH "CSS CLASSES" .IP h2.nroffsh The nroff ".SH" section. These are normally the "headlines" before each sub @@ -96,6 +65,6 @@ Text marked as a reference to another man page. .IP span.emphasis Text marked to be emphasized. .IP p.roffit -Used for advertising final paragraph. +Used for the advertising final paragraph. .SH WWW http://daniel.haxx.se/projects/roffit diff --git a/testpage.output b/testpage.output index baa2925..4aedf75 100644 --- a/testpage.output +++ b/testpage.output @@ -6,20 +6,9 @@

 (RFC 959) is FTP
 but ARFC 959 is a fake
 RFC 3986 is URI syntax
-Full inline urls with http, https, or ftp protocols are converted to html links. For example:
-https://curl.se/, ("https://curl.se/docs"), (https://daniel.haxx.se/projects/roffit/).
-But when the url is in an example it is not converted into an html link..
-   cmd -opt https://example.com
-
-When the protocols are declared within punctuations, but have no domain they remain as an <a> element with the href attribute removed. For example:
-("http://"), (https://), and "ftp://"; are <a> elements, while unpunctuated http:// or https:// or ftp:// are inline text within <p> element.
-Additionally special character options will have anchor links. So if an option similar to the one below is documented; it will have an anchor and anchor link:
--?, --special-char
-Options with special characters will be included in anchors. Such as this option with special character -?, --special-char.
-Also, we must support style staring on one line and ending on another that may be multiple lines off.
-Options
+Also, we must support style staring on one line and ending on another that may be multiple lines off.
+Options
 --bare
-The output HTML will not include any HTML, HEAD or BODY tags. Also note that when this is selected, there will be no inlined CSS but you will have to define the necessary classes yourself.
+The output HTML will not include any HTML, HEAD or BODY tags. Also not that when this is selected, there will be no inlined CSS but you will have to define the necessary classes yourself.
 --mumbo
 Display version number and exit. Also see --jumbo. Word in bold.
 --jumbo

@@ -46,6 +35,6 @@

 span.emphasis
 Text marked to be emphasized.
 p.roffit
-Used for advertising final paragraph.
-Www
+Used for the advertising final paragraph.
+Www
 http://daniel.haxx.se/projects/roffit
 This HTML page was made with roffit.