
Publishing an ISO collection that encodes ISO bibliographic data (ISO SMRL) #1217

Open
stuartgalt opened this issue Aug 14, 2024 · 24 comments

@stuartgalt

I need to be able to publish a new document type that does not have a "document number" but is an ISO product: a collection of ISO 10303 documents. The title page is a bit different; it looks like this:

image

Perhaps the SUMA architecture may be applied to assist building this collection document type.

@opoudjis
Contributor

From the perspective of Metanorma, it looks like a collection with a distinct coverpage, rather than a distinct ISO document type. That makes it more straightforward: a distinct ISO document type would be a lot more involved (since we would be policing the metadata, and making sure it ends up in Word output). But I'm going to defer to @ronaldtse on this one.

@ronaldtse
Contributor

@opoudjis For the SRL it is indeed a collection with a distinct coverpage, but also distinct bibliographic metadata.

You can see it here:

Screenshot 2024-09-29 at 3 58 56 PM

It does have an ISO bibliographic entry:

  • Edition number
  • Publication year
  • Title
  • Identifier
  • ICS
  • Price group

@ronaldtse
Contributor

For Metanorma, the key here is to encode the ISO bibliographic item for this collection. Maybe this is just what "collection metadata" includes.

@ronaldtse ronaldtse changed the title New ISO document type Publishing an ISO collection that encodes ISO bibliographic data (ISO SMRL) Sep 29, 2024
@ronaldtse ronaldtse removed their assignment Sep 29, 2024
@ronaldtse ronaldtse transferred this issue from metanorma/metanorma Sep 29, 2024
@opoudjis
Contributor

Metanorma collection manifests already are pretty much normal Metanorma document bibdata.

@ronaldtse
Contributor

Can we make them identical? I.e. embed a bibdata inside collection.yml.

@opoudjis
Contributor

opoudjis commented Sep 30, 2024

> Can we make them identical? I.e. embed a bibdata inside collection.yml.

When I say "are pretty much normal Metanorma document bibdata", I actually mean "are identical to normal Metanorma document bibdata". That embedding is already implemented: the manifest parser presupposes Relaton YAML deserialisation.

@opoudjis opoudjis moved this from 🏔 High priority to 🏗 In progress in Metanorma Oct 15, 2024
@opoudjis
Contributor

This task seems to be restricted to:

  • Populating the bibdata for the collection (and providing guidance to @manuelfuenmayor to propagate that into Liquid)
  • Providing a cover page Liquid template that renders the bibliographic data in the expected places

@ronaldtse Correct?

@opoudjis
Contributor

opoudjis commented Oct 25, 2024

No response from @ronaldtse @stuartgalt @TRThurman

So, the bibdata representation of Relaton metadata in YAML is given in https://www.relaton.org/specs/relaton-yaml/ . I am not happy with the fact that this representation cleaves so close to the internal representation of the bibliographic resource object, to an extent that the XML does not.

I don't see why it is my job to provide metadata here, when that is an editorial task, but the YAML for https://www.iso.org/publication/PUB100485.html looks like:

bibdata:
  title:
    - type: main
      content: STEP Module and Resource Library (SMRL)
  type: standard
  edition: "11"
  docid:
    type: SMRL
    id: 2024(E)
  flavor: iso
  date:
    - type: published
      "on": 2024 
  ext:
    ics: 25.040.40
    price-code: XL
  copyright:
    owner:
      name: International Organization for Standardization
      abbreviation: ISO
      url: www.iso.org
    from: '2024'
  contributor:
    - organization:
        name: International Organization for Standardization
        url: www.iso.org
        abbreviation: ISO
      role:
        type: publisher

I will work out next how to get this into cover.html so that it looks like the current cover page.

I'm not sure that price-code will be recognised at all; it is currently only supported in IEC and not ISO.
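One detail of the YAML above is easy to miss: the date key is written as `"on"` with quotes. A minimal sketch of the likely reason, using only Ruby's standard `yaml` library (Psych) and a trimmed copy of the data: Psych follows YAML 1.1 conventions, under which an unquoted `on` resolves to the boolean `true`. How Relaton itself reads the key is not shown here; this only demonstrates the parser behaviour.

```ruby
require "yaml"

# Trimmed copy of the bibdata shown above; only a few keys are kept.
quoted = <<~YAML
  bibdata:
    edition: "11"
    date:
      - type: published
        "on": 2024
YAML

# The same date entry, but with the key left unquoted.
unquoted = <<~YAML
  date:
    - type: published
      on: 2024
YAML

date = YAML.safe_load(quoted)["bibdata"]["date"].first
puts date.keys.inspect   # both keys are strings: "type" and "on"

bad = YAML.safe_load(unquoted)["date"].first
puts bad.keys.inspect    # the second key is the boolean true, not the string "on"
```

Quoting the edition (`"11"`) has the same kind of effect: it keeps the value a string instead of letting the parser turn it into an integer.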

@opoudjis
Contributor

opoudjis commented Oct 25, 2024

I'm creating a branch of iso-10303 to work this out, as collection-bibdata.yml.

The extension is being ignored:

<bibdata type="standard" schema-version="v1.2.9">  
  <title type="main" format="text/plain">STEP Module and Resource Library (SMRL)</title>
  <docidentifier type="SMRL">2024(E)</docidentifier>
  <date type="published">
    <on>2024-01-01</on>
  </date>
  <contributor>
    <role type="publisher"/>
    <organization>
      <name>International Organization for Standardization</name>
      <abbreviation>ISO</abbreviation>
      <uri>www.iso.org</uri>
    </organization>
  </contributor>
  <edition>11</edition>
  <copyright>
    <from>2024</from>
    <owner>
      <organization>
        <name>International Organization for Standardization</name>
        <abbreviation>ISO</abbreviation>
        <uri>www.iso.org</uri>
      </organization>
    </owner>
  </copyright>
</bibdata>

@opoudjis
Contributor

opoudjis commented Oct 25, 2024

@andrew2net Does Relaton YAML recognise bibdata/ext, given that its contents are potentially ad hoc? If it does not, we will need to add that functionality, so that we can pass in content like ext/ics. Let me know if that's the case, and I'll make a ticket.

We would also need a syntax to permit arbitrary XML attributes on values, to be consistent with what we have (occasionally) already put in place in bibdata extensions: the convention per https://www.site24x7.com/tools/xml-to-yaml.html appears to be a hash of the keys "-{attribute}" and "#text"
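To make the proposed convention concrete, here is a minimal sketch in plain Ruby of how a hash using "-{attribute}" and "#text" keys could be turned into XML. It illustrates the convention only; `to_xml` and `xml_escape` are hypothetical helpers, not part of Relaton, and the `ext` structure in the usage lines is an assumed shape, not the confirmed Relaton serialisation.

```ruby
# Escape the characters that are unsafe in XML text and attribute values.
def xml_escape(value)
  value.to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;").gsub('"', "&quot;")
end

# Render one element. In a Hash, "-name" keys become attributes, "#text"
# becomes the text content, and every other key becomes a child element.
def to_xml(name, value)
  return value.map { |item| to_xml(name, item) }.join if value.is_a?(Array)
  return "<#{name}>#{xml_escape(value)}</#{name}>" unless value.is_a?(Hash)

  attrs = value.select { |k, _| k.start_with?("-") }
               .map { |k, v| %( #{k[1..]}="#{xml_escape(v)}") }.join
  children = value.reject { |k, _| k.start_with?("-") || k == "#text" }
                  .map { |k, v| to_xml(k, v) }.join
  "<#{name}#{attrs}>#{xml_escape(value["#text"])}#{children}</#{name}>"
end

# An element with an attribute and text, as in the bibdata XML earlier in the thread.
docid = to_xml("docidentifier", { "-type" => "SMRL", "#text" => "2024(E)" })
puts docid

# A hypothetical ext block without attributes.
ext = to_xml("ext", { "ics" => { "code" => "25.040.40" }, "price-code" => "XL" })
puts ext
```

The first call reproduces the `docidentifier` element shown in the XML output above, which is the case the attribute syntax is meant to cover.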

@ronaldtse
Contributor

@opoudjis it is probably better to map to YAML-LD/JSON-LD if we're talking about namespaces etc.

@opoudjis
Contributor

... Where did I talk about namespaces?

@ronaldtse
Contributor

So it's not "namespaces"... I was thinking of differentiating attributes vs elements in the XML sense. In JSON-LD, attributes are labeled @{name}, whereas elements are plain {name}. JSON-LD also supports namespaces in the XML sense.

@opoudjis
Contributor

We use XML to exchange data, and we use YAML to configure data. I am not going to introduce a third format, because I am not interested in proliferating structures for the hell of it. If I wanted a non-human-readable format for config, I would stick to XML.

@ronaldtse
Contributor

@opoudjis it's not exactly like that. Relaton YAML is used for data entry and exchange of Relaton data.

@opoudjis
Contributor

You are getting distracted, and/or you are trying to argue something tangential to what I am saying.

  • We configure metadata of collections in YAML
  • We configure metadata of collections in YAML, because we want humans to write the config files
  • It would be catastrophic to suggest humans author JSON-LD for collection configuration instead of YAML.
  • It would be pointless to do so, while XML is still supported by our infrastructure for use cases that do not involve human readability
  • We need the ability to enter random idiosyncratic extensions to bibliographic metadata that are not predefined
  • We need to support XML attributes, in case that includes an extension that has already been encoded in XML with attributes in a flavour. I am trying to avoid that, but it can happen
  • We need humans to enter such bibliographic metadata in Relaton YAML
  • We are simply NOT going to abandon Relaton YAML in favour of Relaton JSON-LD, because we want human-readable config

... This really should be obvious. If Relaton XML is to never have attributes in its extensions, that will need some significant refactoring. (And with the ongoing dumpster fire of lutaml-model porting, that's the last thing we need.) But while it does have attributes in its extensions, I need the ability to have "-{attribute}" and "#text" in Relaton YAML.

I will add that I have FAR more certainty about what to put in Relaton XML, because Relaton XML has a grammar, and Relaton YAML doesn't. That is, in fact, a problem, and it's a problem you have also run into.

@opoudjis
Contributor

price-code is not being recognised at the moment in Relaton YAML; investigating whether that is a bug.

The rest of the bibliographic content, including ICS, can be processed; I now need to see about extracting that information into a Liquid template for the cover page.

@opoudjis
Contributor

Exposing bibdata as an object to Liquid.

@opoudjis
Contributor

I'm generating the following coverpage based on collection metadata:

Note that this is a single page. Metanorma is not set up right now to generate two collection cover pages, and I'm not sure it should be: I believe it makes more sense to populate a single cover artefact through metadata in Metanorma, and break that up into two files in suma.

This is the input cover html file:

<html><head><meta charset="UTF-8"/></head><body><nav>{{ navigation }}</nav></body></html>

<hr/>

<html><head><meta charset="UTF-8"/></head><body>

<p align=center><img src="https://www.iso.org/files/live/sites/isoorg/files/name_and_logo/Final_ISO_Grey-2015-Registered-sign.png"/></p>

<h1 align=center>{{ docnumber }}</h1>
<h1 align=center>{{ doctitle }}</h1>

{% for title in bibdata.title %}
{% if title.language.first == "fr" %}
<p align=center><i>{{ title.content}}</i></p>
{%endif%}{%endfor%}

      <p align=center><b>Version {{ bibdata.edition.content}}: {% for date in bibdata.date %}{% if date.type == "published" %}{{date.value}}{%endif%}{%endfor%}</b></p>
<hr/>

<table width=100%><tr>
<td valign=top width=30%><b>ICS {{ bibdata.ics | map: "code" | join: ", " }}</b></td>
<td valign=top width=40%>© ISO {{ bibdata.copyright.first.from}}<br/>
<p>All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.</p>

<p>ISO Copyright Office<br/>
CP 401 • Ch. de Blandonnet 8<br/>
CH-1214 Vernier, Geneva    <br/>
Phone: +41 22 749 01 11   <br/>
Fax: +41 22 749 09 47    <br/>
Email: copyright@iso.org<br/>
Website: www.iso.org   <br/>
Published in Switzerland</p>
<td valign=top width=30%>Price group: {{ bibdata.price_group }} </td>
</tr></table>

    </body></html>

and this is what gets populated as a result of running metanorma collection and using its bibdata element. Note that price code is not yet recognised:

Screenshot 2024-10-29 at 23 09 11

In my opinion, that's all that's needed here. Suma can break up that HTML into its two component HTML files, and link one to the other (by inserting a link from the docid).
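For readers less familiar with Liquid, the lookups in the template above can be paraphrased in plain Ruby. This is a sketch only: the hash below is hypothetical, shaped by nothing more than the paths the template dereferences (`bibdata.edition.content`, `bibdata.date`, `bibdata.ics`, `bibdata.copyright.first.from`), and is not the confirmed output of Relaton's hash serialisation.

```ruby
# Hypothetical hash, shaped only by the paths the template dereferences.
bibdata = {
  "title" => [
    { "content" => "STEP Module and Resource Library (SMRL)", "language" => ["en"] },
  ],
  "edition" => { "content" => "11" },
  "date" => [{ "type" => "published", "value" => "2024" }],
  "ics" => [{ "code" => "25.040.40" }],
  "copyright" => [{ "from" => "2024" }],
}

# {% for date in bibdata.date %}{% if date.type == "published" %}{{date.value}}...
published = bibdata["date"].select { |d| d["type"] == "published" }
                           .map { |d| d["value"] }.join

# {{ bibdata.ics | map: "code" | join: ", " }}
ics = bibdata["ics"].map { |i| i["code"] }.join(", ")

# {% if title.language.first == "fr" %}: only French titles are printed in italics.
french_titles = bibdata["title"].select { |t| t["language"].first == "fr" }

version_line = "Version #{bibdata["edition"]["content"]}: #{published}"
copyright_line = "© ISO #{bibdata["copyright"].first["from"]}"

# The key is absent from the hash, so the lookup yields nil;
# Liquid renders nil as an empty string.
price_group = bibdata["price_group"]

puts version_line
puts "ICS #{ics}"
puts copyright_line
```

The last lookup mirrors the situation described above: while the price code is not recognised, `{{ bibdata.price_group }}` has nothing to resolve to and the cell stays empty after its label.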

@opoudjis opoudjis moved this from 🏗 In progress to 👀 In review in Metanorma Oct 29, 2024
@ronaldtse
Contributor

> Exposing bibdata as an object to Liquid.

Can we expose it using Liquid::Drop? Is this to be done in Relaton gem proper?

@opoudjis
Contributor

opoudjis commented Oct 31, 2024

The bibdata has been recovered from the config file, via Relaton on Shale/Lutaml-Model processing of the manifest, and is passed to Liquid as an object: { bibdata: @bibdata.to_hash } That hash processing has indeed been done in Relaton. There is nothing further to be done in Relaton.

The hash format of Relaton data already exists (it is what the YAML is processed in, for God's sake), and Relaton is already ingested as an object (that was the entire point of your insistence that Metanorma config adopt Shale and then Lutaml-Model, which I have spent so many months on). I therefore regard the suggestion that I adopt a brand new deserialisation for Liquid Just Because as yet another reinvention of wheels, for no perceptible purpose (it is a single record, and it's already a hash; this would make the code more not less efficient), and I will not be actioning it.

If you ask me to deserialise bibdata into a hash, and THEN ask me to deserialise a DOM object into a Liquid Drop WHEN I HAVE ALREADY DESERIALISED THAT DOM OBJECT INTO A HASH AT YOUR REQUEST, then you are not keeping track of the architecture.

@ronaldtse
Contributor

No no, this is a misunderstanding.

A Hash of the Relaton object is a new object, not the object itself, and requires resources (memory, processing) to convert.

A Liquid::Drop of a Relaton object (which doesn't exist yet) is a wrapper around that object for Liquid access that is much faster to process (less conversion needed).

The speed difference was demonstrated by @kwkwan in the "xmi" gem: several hours versus several minutes.

Technically, the only change would be "relaton.to_hash" vs "relaton.to_drop". But the latter doesn't exist yet. Maybe the latter can be a Lutaml-model enhancement.
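The distinction being drawn can be sketched in plain Ruby without the Liquid gem. Everything here is hypothetical: `Bibdata` is a stand-in for a parsed Relaton object, and `BibdataWrapper` only imitates the idea behind a `Liquid::Drop` (forwarding lookups to the wrapped object); neither is an existing Relaton or Lutaml-Model class, and the sketch says nothing about actual timings.

```ruby
# Hypothetical stand-in for a parsed Relaton object; not the real class.
Bibdata = Struct.new(:edition, :title, :ics) do
  # Eager conversion: builds a new Hash covering every field.
  def to_hash
    { "edition" => edition, "title" => title, "ics" => ics }
  end
end

# Lazy wrapper in the spirit of a Liquid::Drop: no copy is made, and a field
# is read from the wrapped object only when a template asks for it.
class BibdataWrapper
  ALLOWED = %w[edition title ics].freeze

  def initialize(bibdata)
    @bibdata = bibdata
  end

  def [](key)
    @bibdata.public_send(key) if ALLOWED.include?(key.to_s)
  end
end

record = Bibdata.new("11", "STEP Module and Resource Library (SMRL)", ["25.040.40"])

as_hash = record.to_hash                 # new object, all fields copied up front
as_wrapper = BibdataWrapper.new(record)  # thin object holding a reference

puts as_hash["edition"]
puts as_wrapper["edition"]
puts as_wrapper["unknown"].inspect       # nil: keys outside ALLOWED are not forwarded
```

Both forms answer the same lookups; the difference is when the work happens. The hash pays for every field once, at conversion time, while the wrapper pays per access.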

@opoudjis
Contributor

opoudjis commented Nov 2, 2024

It's a single record, the time spent is minimal (because it is a single record), and your drop object does not even currently exist. Metanorma manifests do not need this.

Find a use case that actually does. Publishing bibliographies in relaton-cli, perhaps.

But metanorma is not a guinea pig, I have enough to do, and this is not solving a real problem. When collections take 2 hours to compile, asking me to save the 0.1 sec spent on a single hash in Liquid is silly.

@opoudjis
Contributor

This change to metanorma is being merged, and it enables passing generic bibdata into a cover page. If and when people review the PR in https://github.com/metanorma/iso-10303/pull/404 , work can resume.
