roffit: fix special characters and broken links (#7)
* roffit: roffit

Additional edits to the regular expressions for special-character options.
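
A minimal sketch of the kind of regular-expression edit involved, assuming a simplified character class (`$special` is a hypothetical stand-in for roffit's `$specialcharacters`) and input shaped like the test page's `-?, --special-char` option. It compares a substitution that matches the trailing comma but never re-emits it with one that captures the comma and appends it to the replacement:

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for roffit's $specialcharacters.
my $special = '?!';
my $input   = '<a name="-"></a><span class="nroffip">-?, --special-char</span>';

# Earlier shape: group 5 matches the comma, but the replacement stops at $4,
# so the comma is consumed and lost.
(my $old = $input) =~
    s/(<a name=)"-"(><\/a>)(<span class="nroffip">-)([$special])(,|<\/span>)/$1"-$4"$2$3$4/g;

# Later shape: the replacement also emits $5, so the comma survives.
(my $new = $input) =~
    s/(<a name=)"-"(><\/a>)(<span class="nroffip">-)([$special])((,)|(<\/span>))/$1"-$4"$2$3$4$5/g;

print "$old\n";
print "$new\n";
```

In this sketch the first result reads `-? --special-char` and the second `-?, --special-char`, which mirrors the change recorded in testpage.output.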

Removed the condition in the field_anchor subroutine and moved the one line it guarded into the "linkfile" subroutine, to reduce the line count.
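
One Perl detail that may be relevant to the removed condition: inside a subroutine, `$1` refers to the most recent successful regex capture, not to the first argument, which arrives in `@_`. The sketch below illustrates this with hypothetical subroutine names that are not part of roffit; it is not a claim about why the condition was removed.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only; neither exists in roffit.
sub sees_capture {
    # $1 is the capture from the last successful match in the dynamic
    # scope, regardless of what was passed to the sub.
    return defined $1 ? "capture=$1" : 'no capture';
}

sub sees_argument {
    my ($flag) = @_;    # arguments arrive in @_
    return "argument=$flag";
}

my ($from_capture, $from_argument) = ('', '');
if ('-&lt;' =~ /(&lt;)/) {
    $from_capture  = sees_capture(1);
    $from_argument = sees_argument(1);
}

print "$from_capture\n";
print "$from_argument\n";
```

Doing the `</span>` to `</a>` replacement directly in the caller, as the diff does, sidesteps the question of how a flag would be passed in.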

Regarding the suggested edits:
I was unable to achieve the fix with text2name and do_encode.
In short, new things break.

Quick takeaway (should be => becomes):
name="--any-option" => name=""--any-option" ("" at start)
name="ANY_NAME" => name="ANYNAME" (_ removed)

Detailed report (explains the above):
Since curl.1 is a good use case for a local test with roffit, I
generated two files:
   1. with the current bagder roffit
   2. with roffit after (multiple) edits
and ran the "diff" command on them:

___
One example case: removing " from do_encode, so that it reads:
sub do_encode($) {
    return encode_entities(shift, q{<>&'#});
}

results in:
name=""--any-option" (2 "" at start)
___

___
One example case: adding special characters in text2name:
sub text2name { .....
   $text =~ s/[^a-zA-Z0-9-`~!@\$%^*()-_=+{};:\'\\|,.?]//g;
   ...}

results in:
name="socks5h://" (should be socks5h).
or 
name="AUNDERSCORE" (should be A_UNDERSCORE).
___
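
One thing worth checking in a character class like the one above: an unescaped `-` between two characters is read as a range, so `)-_` covers every character from `)` to `_`, which includes `/` and `:`. The sketch below uses a shortened class and a hypothetical helper (`strip_disallowed` is not a roffit function) to show the difference that escaping the hyphen makes. Whether this accounts for the results reported above is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of roffit: remove every character that the
# given bracketed character class does not allow.
sub strip_disallowed {
    my ($text, $allowed) = @_;
    (my $copy = $text) =~ s/[^$allowed]//g;
    return $copy;
}

# Unescaped "-" between ")" and "_" forms a range from 0x29 to 0x5F.
my $with_range   = 'a-z0-9()-_';
# Escaping the hyphen keeps it a literal "-".
my $with_literal = 'a-z0-9()\-_';

my $kept_by_range   = strip_disallowed('socks5h://', $with_range);
my $kept_by_literal = strip_disallowed('socks5h://', $with_literal);

print "$kept_by_range\n";
print "$kept_by_literal\n";
```

With the range, `://` is allowed through; with the escaped hyphen it is stripped.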

When file 2 was generated with the roffit from this pull request, no new things were broken.

* roffit: testpage.output

Generated testpage.output with the bug-fixed roffit.

* roffit: testpage.1

Double-checked. Removed the space character at the end of the file.
jhauga authored Jul 21, 2023
1 parent 9d6a975 commit d510204
Showing 2 changed files with 12 additions and 14 deletions.
22 changes: 10 additions & 12 deletions roffit
@@ -50,7 +50,7 @@ my $hrefdir=".";
my $filename;
my $leavecase;

-my $htmlentity = "(&#35;)|(&amp;)|(&#39;)|(&lt;)|(&gt;)"; # html entities to anchors
+my $htmlentity = "(&lt;)|(&gt;)|(&amp;)|(&#39;)|(&#35;)"; # html entities by do_encode
my $field; # defined for global use

while($ARGV[0]) {
@@ -169,13 +169,10 @@ sub text2name {

# fixes special character anchors/anchor links while keeping text anchor/anchor links intact
sub field_anchor {
-$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)($htmlentity)(,|<\/span>)/$1\"-$4\"$2$3$4/g |
-$field =~ s/<span (Class=\"emphasis\")>-($htmlentity)(,|<\/span>)/<a $1 href=\"#-$2\">-$2/gi;
+$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)($htmlentity)/$1\"-$4\"$2$3$4/g;
+$field =~ s/<span (Class=\"emphasis\")>-($htmlentity)/<a $1 href=\"#-$2\">-$2/gi;
$field =~ s/&#35;/hash/ | $field =~ s/&amp;/ampersand/ | $field =~ s/&#39;/single-quote/ |
-$field =~ s/&lt;/less-than/ | $field =~ s/&gt;/greater-than/;
-if ($1 == 1) {
-$field =~ s/<\/span>/<\/a>/; # close as link
-}
+$field =~ s/&lt;/less-than/ | $field =~ s/&gt;/greater-than/;
}

# scan through the file and check for <span> sections we should convert
@@ -277,9 +274,9 @@ sub linkfile {

my $specialcharacters = "`~!@\$%^*()-_=+{};:\'\\|,.?"; # define options for special character
# start process to fix options with special characters
-# non html entities first
-$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)([$specialcharacters]|[\[\]])(,|<\/span>)/$1\"-$4\"$2$3$4/g;
-$field =~ s/(<a class=\"emphasis\") href=\"#-\">-([$specialcharacters]|[\[\]])(,|<\/a>)/$1 href=\"#-$2\">-$2/g;
+# non html entities first
+$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)([$specialcharacters]|[\[\]])((,)|(<\/span>))/$1\"-$4\"$2$3$4$5/g;
+$field =~ s/(<a class=\"emphasis\") href=\"#-\">-([$specialcharacters]|[\[\]])((,)|(<\/a>))/$1 href=\"#-$2\">-$2$4/g;

# fix html entities
# add anchors when conditions is met as few to no matches expected
@@ -288,7 +285,8 @@ sub linkfile {
}
# add anchor links to html entities without removing current anchor links
if ($field =~ /<span (Class=\"emphasis\")>-($htmlentity)(,|<\/span>)/) {
-field_anchor(1); # closing tag to </a>
+field_anchor(); # closing tag to </a>
+$field =~ s/<\/span>/<\/a>/; # close as link
}

# convert (uppercase only) "RFC [number]" to a link
@@ -680,4 +678,4 @@ ROFFIT

if($standalone) {
print "</body></html>\n";
-}
+}
4 changes: 2 additions & 2 deletions testpage.output
@@ -15,8 +15,8 @@
<p class="level0">When the protocols are declared within punctuations, but have no domain they remain as an &lt;a&gt; element with the href attribute removed. For example: </p>
<p class="level0">(&quot;<a>http://</a>&quot;), (<a>https://</a>), and &quot;<a>ftp://</a>&quot;; are &lt;a&gt; elements, while unpunctuated http:// or https:// or ftp:// are inline text within &lt;p&gt; element. </p>
<p class="level0">Additionally special character options will have anchor links. So if an option similar to the one below is documented; it will have an anchor and anchor link: </p>
-<p class="level0"><a name="-?"></a><span class="nroffip">-? --special-char</span> </p>
-<p class="level1">Options with special characters will be included in anchors. Such as this option with special character <a class="emphasis" href="#-?">-? --special-char</a>. </p>
+<p class="level0"><a name="-?"></a><span class="nroffip">-?, --special-char</span> </p>
+<p class="level1">Options with special characters will be included in anchors. Such as this option with special character <a class="emphasis" href="#-?">-?, --special-char</a>. </p>
<p class="level1">Also, we must support <span class="emphasis">style staring on one line and ending on another that may</span> be multiple lines off. </p><a name="OPTIONS"></a><h2 class="nroffsh">Options</h2>
<p class="level0"><a name="--bare"></a><span class="nroffip">--bare</span> </p>
<p class="level1">The output HTML will not include any HTML, HEAD or BODY tags. Also note that when this is selected, there will be no inlined CSS but you will have to define the necessary classes yourself. </p>