Network Working Group                                      M. Nottingham
Internet-Draft                                         November 25, 2016
Obsoletes: 5988 (if approved)
Intended status: Standards Track
Expires: May 29, 2017

Web Linking

draft-nottingham-rfc5988bis-03

Abstract

This specification defines a way to indicate the relationships between resources on the Web (“links”) and the type of those relationships (“link relation types”).

It also defines the serialisation of such links in HTTP headers with the Link header field.

Note to Readers

This is a work-in-progress to revise RFC5988.

The issues list can be found at https://github.com/mnot/I-D/labels/rfc5988bis.

The most recent (often unpublished) draft is at https://mnot.github.io/I-D/rfc5988bis/.

Recent changes are listed at https://github.com/mnot/I-D/commits/gh-pages/rfc5988bis.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress”.

This Internet-Draft will expire on May 29, 2017.

Copyright Notice

Copyright © 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

1. Introduction

This specification defines a way to indicate the relationships between resources on the Web (“links”) and the type of those relationships (“link relation types”).

HTML [W3C.REC-html5-20141028] and Atom [RFC4287] both have well-defined concepts of linking; this specification generalises this into a framework that encompasses linking in these formats and (potentially) elsewhere.

Furthermore, this specification formalises an HTTP header field for conveying such links, having been originally defined in Section 19.6.2.4 of [RFC2068], but removed from [RFC2616].

2. Notational Conventions

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14, [RFC2119], as scoped to those conformance targets.

This document uses the Augmented Backus-Naur Form (ABNF) notation of [RFC7230], including the #rule, and explicitly includes the following rules from it: quoted-string, token, SP (space), BWS (bad whitespace), OWS (optional whitespace), RWS (required whitespace), LOALPHA, DIGIT.

Additionally, the following rules are included from [RFC3986]: URI and URI-Reference; from [RFC6838]: type-name and subtype-name; from [W3C.CR-css3-mediaqueries-20090915]: media_query_list; and from [RFC5646]: Language-Tag.

5. Target Attributes

Target attributes are a set of key/value pairs that describe the link or its target; for example, a media type hint.

This specification does not attempt to coordinate the name of target attributes, their cardinality or use; they are defined both by individual link relations and by link serialisations.

Serialisations SHOULD coordinate their target attributes to avoid conflicts in semantics or syntax. Relation types MAY define additional target attributes specific to them.

The names of target attributes SHOULD conform to the token rule, but SHOULD NOT include any of the characters “%”, “'” or “*”, for portability across serialisations, and MUST be compared in a case-insensitive fashion.
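
As an illustration only, the naming guidance above could be checked along the following lines. This is a minimal sketch, not part of the specification: the function names are hypothetical, the token rule is transcribed from the tchar definition in [RFC7230], and the second discouraged character is assumed to be the ASCII apostrophe.

```python
import re

# One or more "tchar" characters, as in the RFC 7230 token rule.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# Characters discouraged in target attribute names for portability.
# Assumption: the quoted character in the text is the ASCII apostrophe.
_DISCOURAGED = set("%'*")


def is_portable_attribute_name(name: str) -> bool:
    """Return True if name is a token and avoids the discouraged characters."""
    if _TOKEN.fullmatch(name) is None:
        return False
    return not (_DISCOURAGED & set(name))


def same_attribute_name(a: str, b: str) -> bool:
    """Compare two target attribute names case-insensitively."""
    return a.lower() == b.lower()
```

A name such as "hreflang" passes this check, while "title*" is a valid token but is flagged because it contains a discouraged character.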

Target attribute definitions SHOULD specify:

  • Their serialisation into Unicode or a subset thereof, to maximise their chances of portability across link serialisations.
  • The semantics and error handling of multiple occurrences of the attribute on a given link.

This specification does define target attributes for use in the Link HTTP header field in Section 6.4.

7. IANA Considerations

In addition to the actions below, IANA should terminate the Link Relation Application Data Registry, as it has not been used, and future use is not anticipated.

8. Security Considerations

The content of the Link header field is not secure, private or integrity-guaranteed, and due caution should be exercised when using it. Use of Transport Layer Security (TLS) with HTTP ([RFC2818] and [RFC2817]) is currently the only end-to-end way to provide such protection.

Link applications ought to consider the attack vectors opened by automatically following, trusting, or otherwise using links gathered from HTTP headers. In particular, Link header fields that use the “anchor” parameter to associate a link’s context with another resource should be treated with due caution.

The Link header field makes extensive use of IRIs and URIs. See [RFC3987] for security considerations relating to IRIs. See [RFC3986] for security considerations relating to URIs. See [RFC7230] for security considerations relating to HTTP headers.

9. Internationalisation Considerations

Link targets may need to be converted to URIs in order to express them in serialisations that do not support IRIs. This includes the Link HTTP header field.

Similarly, the anchor parameter of the Link header field does not support IRIs, and therefore IRIs must be converted to URIs before inclusion there.
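
The conversion mentioned above might be sketched as follows. This is a simplified illustration under stated assumptions, not a complete implementation of the mapping in [RFC3987]: it percent-encodes the UTF-8 octets of every non-ASCII character and leaves ASCII characters untouched. It does not apply any normalisation or IDNA processing of the host. The function name iri_to_uri is hypothetical.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 octets of each non-ASCII character.

    Simplification: ASCII characters are copied as-is, and no
    normalisation or IDNA conversion of the host is attempted.
    """
    out = []
    for ch in iri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

The result contains only ASCII characters, so it can be carried as a link target or as the value of the anchor parameter.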

Relation types are defined as URIs, not IRIs, to aid in their comparison. It is not expected that they will be displayed to end users.

Note that registered Relation Names are required to be lower-case ASCII letters.

10. References

10.1 Normative References

[I-D.ietf-httpbis-rfc5987bis]
Reschke, J., “Indicating Character Encoding and Language for HTTP Header Field Parameters”, Internet-Draft draft-ietf-httpbis-rfc5987bis-03 (work in progress), July 2016.
[RFC2119]
Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.
[RFC3864]
Klyne, G., Nottingham, M., and J. Mogul, “Registration Procedures for Message Header Fields”, BCP 90, RFC 3864, DOI 10.17487/RFC3864, September 2004, <http://www.rfc-editor.org/info/rfc3864>.
[RFC3986]
Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax”, STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, <http://www.rfc-editor.org/info/rfc3986>.
[RFC3987]
Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs)”, RFC 3987, DOI 10.17487/RFC3987, January 2005, <http://www.rfc-editor.org/info/rfc3987>.
[RFC5226]
Narten, T. and H. Alvestrand, “Guidelines for Writing an IANA Considerations Section in RFCs”, BCP 26, RFC 5226, DOI 10.17487/RFC5226, May 2008, <http://www.rfc-editor.org/info/rfc5226>.
[RFC5646]
Phillips, A., Ed. and M. Davis, Ed., “Tags for Identifying Languages”, BCP 47, RFC 5646, DOI 10.17487/RFC5646, September 2009, <http://www.rfc-editor.org/info/rfc5646>.
[RFC6838]
Freed, N., Klensin, J., and T. Hansen, “Media Type Specifications and Registration Procedures”, BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013, <http://www.rfc-editor.org/info/rfc6838>.
[RFC7230]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing”, RFC 7230, DOI 10.17487/RFC7230, June 2014, <http://www.rfc-editor.org/info/rfc7230>.
[RFC7231]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content”, RFC 7231, DOI 10.17487/RFC7231, June 2014, <http://www.rfc-editor.org/info/rfc7231>.
[W3C.CR-css3-mediaqueries-20090915]
Lie, H., Çelik, T., Glazman, D., and A. Kesteren, “Media Queries”, World Wide Web Consortium CR CR-css3-mediaqueries-20090915, September 2009, <http://www.w3.org/TR/2009/CR-css3-mediaqueries-20090915>.

10.2 Informative References

[RFC2068]
Fielding, R., Gettys, J., Mogul, J., Frystyk, H., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1”, RFC 2068, DOI 10.17487/RFC2068, January 1997, <http://www.rfc-editor.org/info/rfc2068>.
[RFC2616]
Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1”, RFC 2616, DOI 10.17487/RFC2616, June 1999, <http://www.rfc-editor.org/info/rfc2616>.
[RFC2817]
Khare, R. and S. Lawrence, “Upgrading to TLS Within HTTP/1.1”, RFC 2817, DOI 10.17487/RFC2817, May 2000, <http://www.rfc-editor.org/info/rfc2817>.
[RFC2818]
Rescorla, E., “HTTP Over TLS”, RFC 2818, DOI 10.17487/RFC2818, May 2000, <http://www.rfc-editor.org/info/rfc2818>.
[RFC4287]
Nottingham, M., Ed. and R. Sayre, Ed., “The Atom Syndication Format”, RFC 4287, DOI 10.17487/RFC4287, December 2005, <http://www.rfc-editor.org/info/rfc4287>.
[W3C.REC-html-rdfa-20150317]
Sporny, M., “HTML+RDFa 1.1 - Second Edition”, World Wide Web Consortium Recommendation REC-html-rdfa-20150317, March 2015, <http://www.w3.org/TR/2015/REC-html-rdfa-20150317>.
[W3C.REC-html5-20141028]
Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Navara, E., O'Connor, T., and S. Pfeiffer, “HTML5”, World Wide Web Consortium Recommendation REC-html5-20141028, October 2014, <http://www.w3.org/TR/2014/REC-html5-20141028>.
[W3C.REC-xml-names-20091208]
Bray, T., Hollander, D., Layman, A., Tobin, R., and H. Thompson, “Namespaces in XML 1.0 (Third Edition)”, World Wide Web Consortium Recommendation REC-xml-names-20091208, December 2009, <http://www.w3.org/TR/2009/REC-xml-names-20091208>.

B. Algorithm for Parsing Link Headers

Given an HTTP header field-value field_value as a string assuming ASCII encoding, the following algorithm can be used to parse it into the model described by this specification:

  1. Let links be an empty list.
  2. Create link_strings by splitting field_value on “,” characters, excepting “,” characters within quoted strings as per [RFC7230], Section 3.2.6, or which form part of a link’s URI-Reference (i.e. between “<” and “>” characters where the “<” is immediately preceded by OWS and either a “,” character or the beginning of the field_value string).
  3. For each link_string in link_strings:
    1. Let target_string be the string between the first “<” and first “>” characters in link_string. If they do not appear, or do not appear in that order, fail parsing.
    2. Let rest be the remaining characters (if any) after the first “>” character in link_string.
    3. Split rest into an array of strings parameter_strings, on the “;” character, excepting “;” characters within quoted strings as per [RFC7230], Section 3.2.6.
    4. Let link_parameters be an empty array.
    5. For each item parameter in parameter_strings:
      1. Remove OWS from the beginning and end of parameter.
      2. Skip this item if parameter matches the empty string (“”).
      3. Split parameter into param_name and param_value on the first “=” character. If parameter does not contain “=”, let param_name be parameter and param_value be null.
      4. Remove OWS from the end of param_name and the beginning of param_value.
      5. Case-normalise param_name to lowercase.
      6. If the first and last characters of param_value are both DQUOTE:
        1. Remove the first and last characters of param_value.
        2. Replace quoted-pairs within param_value with the octet following the backslash, as per [RFC7230], Section 3.2.6.
      7. If the last character of param_name is an asterisk (“*”), decode param_value according to [I-D.ietf-httpbis-rfc5987bis]. Skip this item if an unrecoverable error is encountered.
      8. Append the tuple (param_name, param_value) to link_parameters.
    6. Let target be the result of relatively resolving (as per [RFC3986], Section 5.2) target_string. Note that any base URI carried in the payload body is NOT used.
    7. Let relations_string be the second item of the first tuple of link_parameters whose first item matches the string “rel”, or the empty string (“”) if it is not present.
    8. Split relations_string into an array of strings relation_types, on RWS (removing all whitespace in the process).
    9. Let context_string be the second item of the first tuple of link_parameters whose first item matches the string “anchor”. If it is not present, context_string is the identity of the representation carrying the Link header ([RFC7231], Section 3.1.4.1), serialised as a URI. Where the identity is “anonymous”, context_string is null.
    10. Let context be the result of relatively resolving (as per [RFC3986], Section 5.2) context_string, unless context_string is null in which case context is null. Note that any base URI carried in the payload body is NOT used.
    11. Let target_attributes be an empty array.
    12. For each tuple (param_name, param_value) of link_parameters:
      1. If param_name matches “rel” or “anchor”, skip this tuple.
      2. If param_name matches “media”, “title”, “title*” or “type” and target_attributes already contains a tuple whose first element matches the value of param_name, skip this tuple.
      3. Append (param_name, param_value) to target_attributes.
    13. Let star_param_names be the set of param_names in the (param_name, param_value) tuples of link_parameters where the last character of param_name is an asterisk (“*”).
    14. For each star_param_name in star_param_names:
      1. Let base_param_name be star_param_name with the last character removed.
      2. If the implementation does not choose to support an internationalised form of a parameter named base_param_name for any reason (including, but not limited to, it being prohibited by the parameter’s specification), remove all tuples from link_parameters whose first member is star_param_name and skip to the next star_param_name.
      3. Remove all tuples from link_parameters whose first member is base_param_name.
      4. Change the first member of all tuples in link_parameters whose first member is star_param_name to base_param_name.
    15. For each relation_type in relation_types:
      1. Case-normalise relation_type to lowercase.
      2. Append a link object to links with the target target, relation type of relation_type, context of context, and target attributes target_attributes.
  4. Return links.
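
The following is a minimal sketch of the steps above, offered as an illustration rather than a reference implementation. It makes several assumptions that are not part of the algorithm: the helper names are hypothetical; a single request_uri argument serves both as the base for relative resolution and as the default link context; any “<” outside a quoted string is treated as opening a URI-Reference; and parameters whose names end in “*” are dropped instead of being decoded.

```python
from urllib.parse import urljoin

OWS = " \t"


def split_outside_quotes(text, sep, respect_angle=False):
    """Split text on sep, ignoring sep inside quoted strings (and <...>)."""
    parts, buf = [], []
    in_quotes = in_angle = escaped = False
    for ch in text:
        if escaped:
            escaped = False
        elif in_quotes:
            if ch == "\\":
                escaped = True
            elif ch == '"':
                in_quotes = False
        elif in_angle:
            if ch == ">":
                in_angle = False
        elif ch == '"':
            in_quotes = True
        elif ch == "<" and respect_angle:
            in_angle = True
        elif ch == sep:
            parts.append("".join(buf))
            buf = []
            continue
        buf.append(ch)
    parts.append("".join(buf))
    return parts


def unquote(value):
    """Strip surrounding DQUOTEs and resolve quoted-pairs."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        return value
    out, escaped = [], False
    for ch in value[1:-1]:
        if escaped or ch != "\\":
            out.append(ch)
            escaped = False
        else:
            escaped = True
    return "".join(out)


def parse_link_header(field_value, request_uri):
    """Parse a Link field-value into a list of link dictionaries."""
    links = []
    for link_string in split_outside_quotes(field_value, ",", True):
        start, end = link_string.find("<"), link_string.find(">")
        if start == -1 or end < start:
            raise ValueError("missing or misordered '<' and '>'")
        target = urljoin(request_uri, link_string[start + 1:end])
        link_parameters = []
        for parameter in split_outside_quotes(link_string[end + 1:], ";"):
            parameter = parameter.strip(OWS)
            if not parameter:
                continue
            name, eq, value = parameter.partition("=")
            name = name.rstrip(OWS).lower()
            if name.endswith("*"):
                continue  # simplification: extended values are not decoded
            value = unquote(value.lstrip(OWS)) if eq else None
            link_parameters.append((name, value))
        rel = next((v for n, v in link_parameters if n == "rel"), "") or ""
        anchor = next((v for n, v in link_parameters if n == "anchor"), None)
        # Assumption: request_uri stands in for the representation's identity.
        context = request_uri if anchor is None else urljoin(request_uri, anchor)
        target_attributes = []
        for name, value in link_parameters:
            if name in ("rel", "anchor"):
                continue
            seen = any(n == name for n, _ in target_attributes)
            if name in ("media", "title", "type") and seen:
                continue
            target_attributes.append((name, value))
        for relation_type in rel.split():
            links.append({"context": context, "rel": relation_type.lower(),
                          "target": target, "attributes": target_attributes})
    return links
```

For example, calling parse_link_header with the field-value `</a,b>; rel="next prev", </c>; rel=up` and a request_uri of `http://example.org/doc` yields three links, one per relation type, with the comma inside the first URI-Reference left intact.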

C. Changes from RFC5988

This specification has the following differences from its predecessor, RFC5988:

  • The initial relation type registrations were removed, since they’ve already been registered by RFC5988.
  • The introduction has been shortened.
  • The Link Relation Application Data Registry has been removed.
  • Incorporated errata.
  • Updated references.
  • Link cardinality was clarified.
  • Terminology was changed from “target IRI” and “context IRI” to “link target” and “link context” respectively.
  • Made assigning a URI to registered relation types serialisation-specific.
  • Removed misleading statement that the link header field is semantically equivalent to HTML and Atom links.
  • More carefully defined how the Experts and IANA should interact.
  • More carefully defined and used “link serialisations” and “link applications.”
  • Clarified the cardinality of target attributes (generically and for “type”).
  • Corrected the default link context for the Link header field, to be dependent upon the identity of the representation (as per RFC7231).
  • Defined a suggested parsing algorithm for the Link header.
  • The value space of target attributes and their definition has been specified.
  • The ABNF has been updated to be compatible with [RFC7230]. In particular, whitespace is now explicit.
  • Some parameters on the HTTP header field can now appear as a token.
  • Handling of quoted strings is now defined by [RFC7230].
  • The type header field parameter now needs to be quoted (as token does not allow “/”).

Author's Address

Mark Nottingham
Email: mnot@mnot.net
URI: https://www.mnot.net/