roffit: fix special characters and broken links #7

Merged
merged 7 commits into from
Jul 21, 2023
22 changes: 10 additions & 12 deletions roffit
@@ -50,7 +50,7 @@ my $hrefdir=".";
 my $filename;
 my $leavecase;

-my $htmlentity = "(#)|(&)|(')|(<)|(>)"; # html entities to anchors
+my $htmlentity = "(<)|(>)|(&)|(')|(#)"; # html entities by do_encode
 my $field; # defined for global use

 while($ARGV[0]) {
@@ -169,13 +169,10 @@ sub text2name {

 # fixes special character anchors/anchor links while keeping text anchor/anchor links intact
 sub field_anchor {
-$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)($htmlentity)(,|<\/span>)/$1\"-$4\"$2$3$4/g |
-$field =~ s/<span (Class=\"emphasis\")>-($htmlentity)(,|<\/span>)/<a $1 href=\"#-$2\">-$2/gi;
+$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)($htmlentity)/$1\"-$4\"$2$3$4/g;
+$field =~ s/<span (Class=\"emphasis\")>-($htmlentity)/<a $1 href=\"#-$2\">-$2/gi;
 $field =~ s/&#35;/hash/ | $field =~ s/&amp;/ampersand/ | $field =~ s/&#39;/single-quote/ |
-$field =~ s/&lt;/less-than/ | $field =~ s/&gt;/greater-than/;
-if ($1 == 1) {
-$field =~ s/<\/span>/<\/a>/; # close as link
-}
+$field =~ s/&lt;/less-than/ | $field =~ s/&gt;/greater-than/;
 }

 # scan through the file and check for <span> sections we should convert
@@ -277,9 +274,9 @@ sub linkfile {

 my $specialcharacters = "`~!@\$%^*()-_=+{};:\'\\|,.?"; # define options for special character
 # start process to fix options with special characters
-# non html entities first
-$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)([$specialcharacters]|[\[\]])(,|<\/span>)/$1\"-$4\"$2$3$4/g;
-$field =~ s/(<a class=\"emphasis\") href=\"#-\">-([$specialcharacters]|[\[\]])(,|<\/a>)/$1 href=\"#-$2\">-$2/g;
+# non html entities first
+$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)([$specialcharacters]|[\[\]])((,)|(<\/span>))/$1\"-$4\"$2$3$4$5/g;
+$field =~ s/(<a class=\"emphasis\") href=\"#-\">-([$specialcharacters]|[\[\]])((,)|(<\/a>))/$1 href=\"#-$2\">-$2$4/g;

 # fix html entities
 # add anchors when conditions is met as few to no matches expected
@@ -288,7 +285,8 @@ sub linkfile {
 }
 # add anchor links to html entities without removing current anchor links
 if ($field =~ /<span (Class=\"emphasis\")>-($htmlentity)(,|<\/span>)/) {
-field_anchor(1); # closing tag to </a>
+field_anchor(); # closing tag to </a>
+$field =~ s/<\/span>/<\/a>/; # close as link
 }

 # convert (uppercase only) "RFC [number]" to a link
@@ -680,4 +678,4 @@ ROFFIT

 if($standalone) {
 print "</body></html>\n";
-}
+}
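
The effect of the `-277,9 +274,9` hunk on the named anchor can be sketched in isolation. This is a minimal illustration, not the script's actual code path: `$special` is a hypothetical, simplified stand-in for `$specialcharacters`, and `$input` is an assumed input string modeled on the option line in testpage.output. It compares the old substitution, which matches the trailing comma but never writes it back, with the new one, which re-emits it through group 5.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed input, modeled on the "-?, --special-char" line in testpage.output.
my $input = '<a name="-"></a><span class="nroffip">-?, --special-char</span>';

# Hypothetical, simplified stand-in for $specialcharacters.
my $special = '?!';

# Old pattern: group 5 consumes "," or "</span>" but the replacement omits it.
(my $old = $input) =~
    s/(<a name=)"-"(><\/a>)(<span class="nroffip">-)([$special])(,|<\/span>)/$1"-$4"$2$3$4/g;

# New pattern: group 5 wraps the delimiter and the replacement writes it back.
(my $new = $input) =~
    s/(<a name=)"-"(><\/a>)(<span class="nroffip">-)([$special])((,)|(<\/span>))/$1"-$4"$2$3$4$5/g;

print "old: $old\n";
print "new: $new\n";
```

Under these assumptions the old form yields `-? --special-char` and the new form yields `-?, --special-char`, which is the same difference the testpage.output hunk records.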
4 changes: 2 additions & 2 deletions testpage.output
@@ -15,8 +15,8 @@
 <p class="level0">When the protocols are declared within punctuations, but have no domain they remain as an &lt;a&gt; element with the href attribute removed. For example: </p>
 <p class="level0">(&quot;<a>http://</a>&quot;), (<a>https://</a>), and &quot;<a>ftp://</a>&quot;; are &lt;a&gt; elements, while unpunctuated http:// or https:// or ftp:// are inline text within &lt;p&gt; element. </p>
 <p class="level0">Additionally special character options will have anchor links. So if an option similar to the one below is documented; it will have an anchor and anchor link: </p>
-<p class="level0"><a name="-?"></a><span class="nroffip">-? --special-char</span> </p>
-<p class="level1">Options with special characters will be included in anchors. Such as this option with special character <a class="emphasis" href="#-?">-? --special-char</a>. </p>
+<p class="level0"><a name="-?"></a><span class="nroffip">-?, --special-char</span> </p>
+<p class="level1">Options with special characters will be included in anchors. Such as this option with special character <a class="emphasis" href="#-?">-?, --special-char</a>. </p>
 <p class="level1">Also, we must support <span class="emphasis">style staring on one line and ending on another that may</span> be multiple lines off. </p><a name="OPTIONS"></a><h2 class="nroffsh">Options</h2>
 <p class="level0"><a name="--bare"></a><span class="nroffip">--bare</span> </p>
 <p class="level1">The output HTML will not include any HTML, HEAD or BODY tags. Also note that when this is selected, there will be no inlined CSS but you will have to define the necessary classes yourself. </p>