Metadata Without A Proposal
Philip Linde linde.philip at gmail.com
Fri Feb 26 12:16:31 GMT 2021
On Fri, 26 Feb 2021 11:51:08 +0100 nothien at uber.space wrote:
I've lost track of the currently raging metadata thread entirely, and so
I've started this as a new post.
Thus far, I think there's general consensus on the following needs for
any metadata proposal:
1. Must degrade gracefully for clients that don't understand metadata.
2. Must not be English-specific.
What is the preferable alternative? We could use numbers to indicate element type, but ultimately numbers are dependent on numeral systems, which depend on language and culture.
If instead of using English directly, we define opaque strings of characters for the tags, such that the tag "author" consistently means "author", we really achieve the same thing. That is a simple solution that is language independent.
Or we could use emoji, although I believe most computer users in the world would have a harder time typing out a given emoji than a given opaque, ASCII- and English-compatible string.
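To illustrate the opaque-tag idea, here is a minimal sketch of how a client might treat a tag such as "author" purely as an identifier and look up a display label per locale. The label tables, the locale codes, and the function name label_for are all hypothetical and not part of any proposal.

```python
# Hypothetical label tables. The tag strings on the inner-left side are
# opaque identifiers; only the values are ever shown to a reader.
LABELS = {
    "en": {"author": "Author", "date": "Date"},
    "sv": {"author": "Författare", "date": "Datum"},
}


def label_for(tag, locale):
    """Return the display label for a tag, falling back to the raw tag."""
    return LABELS.get(locale, {}).get(tag, tag)
```

A client without a table for the reader's locale simply shows the raw tag, so nothing breaks when a translation is missing.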
3. Must be machine-parsable.
We should consider the difference between needs and wants here. If I have no interest in specifying a license for my work other than what is implied by my sharing it, that doesn't necessarily mean I don't want to specify a date or an author, so perhaps all or most elements should be optional.
4. Should affect presentation.
gemtext as a whole is about separating content from presentation.
Some of the earlier metadata proposals referred to metadata for
presentation, e.g. to specify a color to view the text in. This is
against the spirit of gemtext/Gemini (if not the spec).
Agreed, but as I understand it you do *not* want it to affect presentation.
5. Must be difficult to extend.
Again, this comes from the general Gemini philosophy that anything
that can be misused will be misused. This rules out lots of current
proposals because they specify tags, and the usage of tags can only be
controlled by convention, which is subject to change.
What do you propose that prevents conventional use from dictating reality? And why is it important that the specification can not be extended? Unlike e.g. text/gemini, if a client doesn't support some superset of the tags initially specified, there is no degradation. If in the future we want to extend a metadata format to support e.g. specifying where, in addition to when, it was written, the clients that don't support it shouldn't suffer from it.
The only important concern to me is that there is a canonical description of tags. That description can be extended indefinitely as far as I'm concerned, for as long as the original meanings of the initial set of supported tags aren't changed or overloaded by newer tags.
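The rule described above (new tags may be added, existing meanings may not change) can be sketched as a simple check, together with an older client that ignores tags it does not know. This is only an illustration: the dictionaries of tag descriptions, the "place" tag, and both function names are assumptions made up for the example.

```python
def is_additive_extension(old, new):
    """True if every tag in `old` is still in `new` with the same meaning."""
    return all(new.get(tag) == meaning for tag, meaning in old.items())


def read_known(fields, supported):
    """Keep only the fields whose tags this client supports."""
    return {tag: value for tag, value in fields.items() if tag in supported}


# Hypothetical canonical descriptions, before and after an extension.
v1 = {"author": "who wrote the document", "date": "when it was written"}
v2 = dict(v1, place="where it was written")  # adds a tag, changes nothing
v3 = dict(v1, date="when it was last edited")  # redefines an existing tag
```

Here is_additive_extension(v1, v2) holds while is_additive_extension(v1, v3) does not, and a client that only knows v1 reads a v2 document by dropping "place" and keeping the rest.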
6. Must be accessible.
Some proposals discussed the usage of emojis, and others have opted
for creating new unofficial line types. These don't degrade
gracefully for things like screen readers, until they adopt the
metadata proposal. That's not great.
I think that instead of defining ourselves what fields are important we should start from a standard, e.g. DCMI with the element set defined in IETF RFC 5013.
With that as a basis, if there is no suitable format already, we can define a human readable, text-compatible data format and a corresponding text/xyz MIME type. Then, a text/gemini document whose author wants to supply additional metadata can link to a metadata file which the server serves with the above MIME type. A client that does not support the MIME type should fall back to displaying it as plain text, as it would any unknown text/* type. A client that does support it can localize the elements, including things like names and date and time formats. If the client is a crawler, it should find the linked metadata document as a matter of its normal operation because it is linked from the document.
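As a rough sketch of the client side of this idea: the code below dispatches on the MIME type of a fetched resource, parses a metadata file into the fifteen RFC 5013 element names, and falls back to plain text for other text/* types. The "element: value" line syntax is an assumption invented for the example, and "text/xyz" is only the placeholder name used above, not a registered type.

```python
from datetime import date

# The fifteen element names of the Dublin Core Metadata Element Set (RFC 5013).
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

METADATA_MIME = "text/xyz"  # placeholder from the text above, not a real type


def parse_metadata(body):
    """Parse assumed 'element: value' lines, keeping only known elements."""
    fields = {}
    for line in body.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip().lower()
        if sep and name in DC_ELEMENTS:
            fields[name] = value.strip()
    return fields


def handle_response(mime_type, body):
    """Parse the metadata type; treat any other text/* type as plain text."""
    if mime_type == METADATA_MIME:
        return "metadata", parse_metadata(body)
    if mime_type.startswith("text/"):
        return "plain", body
    return "unsupported", None


def localize_date(iso_value, pattern):
    """Format an ISO 8601 date with a strftime pattern chosen by the client."""
    return date.fromisoformat(iso_value).strftime(pattern)
```

A client that never heard of the metadata type would only have the text/* branch, which is the graceful degradation described above: the reader still sees the file, just without localized labels or dates.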
Such formats already exist, but there is little interest in authoring such files.
In that way, no extension or change to Gemini is necessary. No specialized sub-formats for existing line types either.
Personally I don't think this is a standard I would use either way. It's mostly for the benefit of robots that there's a point in formalizing information like this. Humans can interpret such information as indicated in the document itself in a much wider variety of formats. It's not my intention, primarily, to serve robots.
--
Philip