Network Working Group                                      M. Nottingham
Internet-Draft                                        September 22, 2017
Obsoletes: 5988 (if approved)
Intended status: Standards Track
Expires: March 26, 2018

Web Linking

draft-nottingham-rfc5988bis-08

Abstract

This specification defines a model for the relationships between resources on the Web (“links”) and the type of those relationships (“link relation types”).

It also defines the serialisation of such links in HTTP headers with the Link header field.

Note to Readers

RFC EDITOR: please remove this section before publication

This is a work-in-progress to revise RFC5988.

The issues list can be found at https://github.com/mnot/I-D/labels/rfc5988bis.

The most recent (often, unpublished) draft is at https://mnot.github.io/I-D/rfc5988bis/.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress”.

This Internet-Draft will expire on March 26, 2018.

Copyright Notice

Copyright © 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

1. Introduction

This specification defines a model for the relationships between resources on the Web (“links”) and the type of those relationships (“link relation types”).

HTML [W3C.REC-html5-20141028] and Atom [RFC4287] both have well-defined concepts of linking; Section 2 generalises this into a framework that encompasses linking in these formats and (potentially) elsewhere.

Furthermore, Section 3 defines an HTTP header field for conveying such links.

1.1. Notational Conventions

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document uses the Augmented Backus-Naur Form (ABNF) notation of [RFC7230], including the #rule, and explicitly includes the following rules from it: quoted-string, token, SP (space), BWS (bad whitespace), OWS (optional whitespace), RWS (required whitespace), LOALPHA, DIGIT.

Additionally, the following rules are included from [RFC3986]: URI and URI-Reference; from [RFC6838]: type-name and subtype-name; from [W3C.REC-css3-mediaqueries-20120619]: media-query-list; and from [RFC5646]: Language-Tag.

1.2. Conformance and Error Handling

The requirements regarding conformance and error handling highlighted in [RFC7230], Section 2.5 apply to this document.

4. IANA Considerations

5. Security Considerations

The content of the Link header field is not secure, private or integrity-guaranteed. Use of Transport Layer Security (TLS) with HTTP ([RFC2818]) is currently the only end-to-end way to provide these properties.

Link applications ought to consider the attack vectors opened by automatically following, trusting, or otherwise using links gathered from HTTP header fields.

For example, Link header fields that use the “anchor” parameter to associate a link’s context with another resource cannot be trusted since they are effectively assertions by a third party that could be incorrect or malicious. Applications can mitigate this risk by specifying that such links should be discarded unless some relationship between the resources is established (e.g., they share the same authority).
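
As an illustration only, the same-authority mitigation could be sketched as follows. The function name anchor_is_trusted is hypothetical, and comparing the scheme in addition to the authority is an assumption of this sketch (a stricter policy than the text requires); a link application would define its own notion of an acceptable relationship.

```python
from urllib.parse import urljoin, urlsplit


def anchor_is_trusted(representation_url: str, anchor: str) -> bool:
    """Hypothetical policy check: accept an "anchor" only when the
    resolved link context shares the scheme and authority of the
    representation that carried the Link header field."""
    # Resolve the anchor against the representation's URL first, so that
    # relative references and fragment-only anchors are compared fairly.
    context = urlsplit(urljoin(representation_url, anchor))
    source = urlsplit(representation_url)
    # Assumption: lowercasing the whole authority is an acceptable
    # approximation of host case-insensitivity for this sketch.
    return (context.scheme, context.netloc.lower()) == (
        source.scheme,
        source.netloc.lower(),
    )
```

Under such a policy, a link whose anchor fails the check would be discarded rather than processed.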

Dereferencing links has a number of risks, depending on the application in use. For example, the Referer header [RFC7231] can expose information about the application’s state (including private information) in its value. Likewise, cookies [RFC6265] are another mechanism that, if used, can become an attack vector. Applications can mitigate these risks by carefully specifying how such mechanisms should operate.

The Link header field makes extensive use of IRIs and URIs. See [RFC3987] Section 8 for security considerations relating to IRIs. See [RFC3986] Section 7 for security considerations relating to URIs. See [RFC7230] Section 9 for security considerations relating to HTTP header fields.

6. Internationalisation Considerations

Link targets may need to be converted to URIs in order to express them in serialisations that do not support IRIs. This includes the Link HTTP header field.

Similarly, the anchor parameter of the Link header field does not support IRIs, and therefore IRIs must be converted to URIs before inclusion there.
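
A minimal sketch of such a conversion is shown below, assuming the percent-encoding approach of [RFC3987], Section 3.1: characters outside the URI repertoire are encoded as UTF-8 and percent-encoded. The function name iri_to_uri is hypothetical. The sketch does not apply the optional conversion of internationalised host names, and it also percent-encodes ASCII characters that are not permitted in URIs (such as spaces), which goes slightly beyond the mapping itself.

```python
from urllib.parse import quote

# Characters left untouched: "%" (so existing escapes survive) plus the
# URI reserved characters. Letters, digits, "-", ".", "_" and "~" are
# never encoded by quote().
_URI_SAFE = "!#$%&'()*+,/:;=?@[]~"


def iri_to_uri(iri: str) -> str:
    """Percent-encode (as UTF-8) every character that cannot appear in a URI."""
    return quote(iri, safe=_URI_SAFE, encoding="utf-8")
```

For example, an IRI whose path contains "é" would be serialised with "%C3%A9" in its place, while a string that is already a valid URI passes through unchanged.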

Relation types are defined as URIs, not IRIs, to aid in their comparison. It is not expected that they will be displayed to end users.

Note that registered Relation Names are required to be lower-case ASCII letters.

7. References

7.1. Normative References

[I-D.ietf-httpbis-rfc5987bis]
Reschke, J., “Indicating Character Encoding and Language for HTTP Header Field Parameters”, Internet-Draft draft-ietf-httpbis-rfc5987bis-05 (work in progress), February 2017.
[RFC2119]
Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC3864]
Klyne, G., Nottingham, M., and J. Mogul, “Registration Procedures for Message Header Fields”, BCP 90, RFC 3864, DOI 10.17487/RFC3864, September 2004, <https://www.rfc-editor.org/info/rfc3864>.
[RFC3986]
Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax”, STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, <https://www.rfc-editor.org/info/rfc3986>.
[RFC3987]
Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs)”, RFC 3987, DOI 10.17487/RFC3987, January 2005, <https://www.rfc-editor.org/info/rfc3987>.
[RFC5646]
Phillips, A., Ed. and M. Davis, Ed., “Tags for Identifying Languages”, BCP 47, RFC 5646, DOI 10.17487/RFC5646, September 2009, <https://www.rfc-editor.org/info/rfc5646>.
[RFC6838]
Freed, N., Klensin, J., and T. Hansen, “Media Type Specifications and Registration Procedures”, BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013, <https://www.rfc-editor.org/info/rfc6838>.
[RFC7230]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing”, RFC 7230, DOI 10.17487/RFC7230, June 2014, <https://www.rfc-editor.org/info/rfc7230>.
[RFC7231]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content”, RFC 7231, DOI 10.17487/RFC7231, June 2014, <https://www.rfc-editor.org/info/rfc7231>.
[RFC8126]
Cotton, M., Leiba, B., and T. Narten, “Guidelines for Writing an IANA Considerations Section in RFCs”, BCP 26, RFC 8126, DOI 10.17487/RFC8126, June 2017, <https://www.rfc-editor.org/info/rfc8126>.
[RFC8174]
Leiba, B., “Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words”, BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[W3C.REC-css3-mediaqueries-20120619]
Rivoal, F., “Media Queries”, World Wide Web Consortium Recommendation REC-css3-mediaqueries-20120619, June 2012, <http://www.w3.org/TR/2012/REC-css3-mediaqueries-20120619>.

7.2. Informative References

[RFC2046]
Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types”, RFC 2046, DOI 10.17487/RFC2046, November 1996, <https://www.rfc-editor.org/info/rfc2046>.
[RFC2818]
Rescorla, E., “HTTP Over TLS”, RFC 2818, DOI 10.17487/RFC2818, May 2000, <https://www.rfc-editor.org/info/rfc2818>.
[RFC4287]
Nottingham, M., Ed. and R. Sayre, Ed., “The Atom Syndication Format”, RFC 4287, DOI 10.17487/RFC4287, December 2005, <https://www.rfc-editor.org/info/rfc4287>.
[RFC6265]
Barth, A., “HTTP State Management Mechanism”, RFC 6265, DOI 10.17487/RFC6265, April 2011, <https://www.rfc-editor.org/info/rfc6265>.
[W3C.REC-html5-20141028]
Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Navara, E., O'Connor, T., and S. Pfeiffer, “HTML5”, World Wide Web Consortium Recommendation REC-html5-20141028, October 2014, <http://www.w3.org/TR/2014/REC-html5-20141028>.

B. Algorithms for Parsing Link Header Fields

This appendix outlines a set of non-normative algorithms for parsing the Link header fields out of a header set, for parsing a Link header field value, and for parsing generic parts of the field value.

These algorithms are more permissive than the ABNF defining the syntax might suggest; the error handling embodied in them is a reasonable approach, but not one that is required. As such they are advisory only, and in cases where there is disagreement, the correct behaviour is defined by the body of this specification.

B.1. Parsing a Header Set for Links

This algorithm can be used to parse the Link header fields that an HTTP header set contains. Given a header_set of (string field_name, string field_value) pairs, assuming ASCII encoding, it returns a list of link objects.

  1. Let field_values be a list containing the members of header_set whose field_name is a case-insensitive match for “link”.
  2. Let links be an empty list.
  3. For each field_value in field_values:
    1. Let value_links be the result of Parsing A Link Field Value (Appendix B.2) from field_value.
    2. Append each member of value_links to links.
  4. Return links.
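
The steps above could be sketched as follows. This is an illustration, not a reference implementation: the function name parse_header_set is hypothetical, and the field-value parser of Appendix B.2 is passed in as the parameter parse_link_field_value rather than implemented here.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

LinkT = TypeVar("LinkT")


def parse_header_set(
    header_set: Iterable[Tuple[str, str]],
    parse_link_field_value: Callable[[str], List[LinkT]],
) -> List[LinkT]:
    """Collect the links from every Link header field in header_set."""
    links: List[LinkT] = []  # Step 2
    for field_name, field_value in header_set:
        # Step 1: field names are matched case-insensitively.
        if field_name.lower() != "link":
            continue
        # Step 3: parse each field value and append its links in order.
        links.extend(parse_link_field_value(field_value))
    return links  # Step 4
```

Passing the field-value parser as an argument keeps the sketch self-contained; a real implementation would more likely call its Appendix B.2 routine directly.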

B.2. Parsing a Link Field Value

This algorithm parses zero or more comma-separated link-values from a Link header field. Given a string field_value, assuming ASCII encoding, it returns a list of link objects.

  1. Let links be an empty list.
  2. While field_value has content:
    1. Consume any leading OWS.
    2. If the first character is not “<”, return links.
    3. Discard the first character (“<”).
    4. Consume up to but not including the first “>” character or end of field_value and let the result be target_string.
    5. If the next character is not “>”, return links.
    6. Discard the leading “>” character.
    7. Let link_parameters be the result of Parsing Parameters (Appendix B.3) from field_value (consuming zero or more characters of it).
    8. Let target be the result of relatively resolving (as per [RFC3986], Section 5.2) target_string. Note that any base URI carried in the payload body is NOT used.
    9. Let relations_string be the second item of the first tuple of link_parameters whose first item matches the string “rel”, or the empty string (“”) if it is not present.
    10. Split relations_string on RWS (removing it in the process) into a list of strings relation_types.
    11. Let context_string be the second item of the first tuple of link_parameters whose first item matches the string “anchor”. If it is not present, context_string is the URL of the representation carrying the Link header [RFC7231], Section 3.1.4.1, serialised as a URI. Where the URL is anonymous, context_string is null.
    12. Let context be the result of relatively resolving (as per [RFC3986], Section 5.2) context_string, unless context_string is null in which case context is null. Note that any base URI carried in the payload body is NOT used.
    13. Let target_attributes be an empty list.
    14. For each tuple (param_name, param_value) of link_parameters:
      1. If param_name matches “rel” or “anchor”, skip this tuple.
      2. If param_name matches “media”, “title”, “title*” or “type” and target_attributes already contains a tuple whose first element matches the value of param_name, skip this tuple.
      3. Append (param_name, param_value) to target_attributes.
    15. Let star_param_names be the set of param_names in the (param_name, param_value) tuples of link_parameters where the last character of param_name is an asterisk (“*”).
    16. For each star_param_name in star_param_names:
      1. Let base_param_name be star_param_name with the last character removed.
      2. If the implementation does not choose to support an internationalised form of a parameter named base_param_name for any reason (including, but not limited to, it being prohibited by the parameter’s specification), remove all tuples from link_parameters whose first member is star_param_name and skip to the next star_param_name.
      3. Remove all tuples from link_parameters whose first member is base_param_name.
      4. Change the first member of all tuples in link_parameters whose first member is star_param_name to base_param_name.
    17. For each relation_type in relation_types:
      1. Case-normalise relation_type to lowercase.
      2. Append a link object to links with the target target, relation type of relation_type, context of context, and target attributes target_attributes.
  3. Return links.
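
As a partial illustration, steps 8 to 14 and step 17 could be sketched as below for a single link-value whose target_string and link_parameters have already been extracted. The names Link and build_links are hypothetical, representation_url stands for the URL of the representation carrying the header field (None when it is anonymous) and is assumed to be the base for relative resolution, and the handling of "*" parameters in steps 15 and 16 is deliberately left out.

```python
from typing import List, NamedTuple, Optional, Tuple
from urllib.parse import urljoin

Params = List[Tuple[str, str]]


class Link(NamedTuple):
    target: str
    relation_type: str
    context: Optional[str]
    target_attributes: Params


# Parameters of which only the first occurrence is kept (step 14.2).
SINGLE_VALUED = {"media", "title", "title*", "type"}


def build_links(target_string: str, link_parameters: Params,
                representation_url: Optional[str]) -> List[Link]:
    def first(name: str, default: Optional[str]) -> Optional[str]:
        return next((v for n, v in link_parameters if n == name), default)

    base = representation_url or ""
    target = urljoin(base, target_string)                  # Step 8
    relation_types = (first("rel", "") or "").split()      # Steps 9-10
    context_string = first("anchor", representation_url)   # Step 11
    context = (None if context_string is None
               else urljoin(base, context_string))         # Step 12
    target_attributes: Params = []                         # Step 13
    for name, value in link_parameters:                    # Step 14
        if name in ("rel", "anchor"):
            continue
        if name in SINGLE_VALUED and any(n == name for n, _ in target_attributes):
            continue
        target_attributes.append((name, value))
    # Step 17: one link object per (lowercased) relation type.
    return [Link(target, rel.lower(), context, target_attributes)
            for rel in relation_types]
```

Note that a link-value without a "rel" parameter yields no link objects, since relation_types is then empty.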

B.3. Parsing Parameters

This algorithm parses the parameters from a header field value. Given an ASCII string input, it returns a list of (string parameter_name, string parameter_value) tuples that it contains. input is modified to remove the parsed parameters.

  1. Let parameters be an empty list.
  2. While input has content:
    1. Consume any leading OWS.
    2. If the first character is not “;”, return parameters.
    3. Discard the leading “;” character.
    4. Consume any leading OWS.
    5. Consume up to but not including the first BWS, “=”, “;”, “,” character or end of input and let the result be parameter_name.
    6. Consume any leading BWS.
    7. If the next character is “=”:
      1. Discard the leading “=” character.
      2. Consume any leading BWS.
      3. If the next character is DQUOTE, let parameter_value be the result of Parsing a Quoted String (Appendix B.4) from input (consuming zero or more characters of it).
      4. Else, consume the contents up to but not including the first “;”, “,” character or end of input and let the results be parameter_value.
      5. If the last character of parameter_name is an asterisk (“*”), decode parameter_value according to [I-D.ietf-httpbis-rfc5987bis]. Continue processing input if an unrecoverable error is encountered.
    8. Else:
      1. Let parameter_value be an empty string.
    9. Case-normalise parameter_name to lowercase.
    10. Append (parameter_name, parameter_value) to parameters.
    11. Consume any leading OWS.
    12. If the next character is “,” or the end of input, stop processing input and return parameters.
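
One possible rendering of these steps is sketched below. Because Python strings are immutable, the hypothetical function parse_parameters returns the unparsed remainder alongside the parameters instead of modifying input. The decoding of "*" parameters (step 7.5) is omitted, and the helper _parse_quoted_string is a compact stand-in for Appendix B.4.

```python
from typing import List, Tuple

WS = " \t"  # OWS and BWS are both runs of SP / HTAB


def _parse_quoted_string(data: str) -> Tuple[str, str]:
    output, pos = [], 1  # caller guarantees data starts with DQUOTE
    while pos < len(data):
        char = data[pos]
        pos += 1
        if char == "\\":
            if pos >= len(data):
                break
            char = data[pos]  # the escaped character is taken literally
            pos += 1
        elif char == '"':
            break
        output.append(char)
    return "".join(output), data[pos:]


def _split_at(data: str, stops: str) -> Tuple[str, str]:
    """Split data before the first character found in stops."""
    end = 0
    while end < len(data) and data[end] not in stops:
        end += 1
    return data[:end], data[end:]


def parse_parameters(data: str) -> Tuple[List[Tuple[str, str]], str]:
    parameters: List[Tuple[str, str]] = []           # Step 1
    while data:                                      # Step 2
        data = data.lstrip(WS)                       # Step 2.1
        if not data.startswith(";"):                 # Step 2.2
            break
        data = data[1:].lstrip(WS)                   # Steps 2.3-2.4
        name, data = _split_at(data, WS + "=;,")     # Step 2.5
        data = data.lstrip(WS)                       # Step 2.6
        value = ""                                   # Step 2.8 default
        if data.startswith("="):                     # Step 2.7
            data = data[1:].lstrip(WS)
            if data.startswith('"'):
                value, data = _parse_quoted_string(data)
            else:
                value, data = _split_at(data, ";,")
        parameters.append((name.lower(), value))     # Steps 2.9-2.10
        data = data.lstrip(WS)                       # Step 2.11
        if not data or data.startswith(","):         # Step 2.12
            break
    return parameters, data
```

The returned remainder starts at the "," that separates link-values (if any), which leaves the decision about how to continue to the caller.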

B.4. Parsing a Quoted String

This algorithm parses a quoted string, as per [RFC7230], Section 3.2.6. Given an ASCII string input, it returns an unquoted string. input is modified to remove the parsed string.

  1. Let output be an empty string.
  2. If the first character of input is not DQUOTE, return output.
  3. Discard the first character.
  4. While input has content:
    1. If the first character is a backslash (“\”):
      1. Discard the first character.
      2. If there is no more input, return output.
      3. Else, consume the first character and append it to output.
    2. Else, if the first character is DQUOTE, discard it and return output.
    3. Else, consume the first character and append it to output.
  5. Return output.
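
A minimal sketch of this algorithm follows. The function name parse_quoted_string is hypothetical; since Python strings are immutable, it returns the remaining input together with the unquoted string rather than modifying input in place.

```python
from typing import List, Tuple


def parse_quoted_string(data: str) -> Tuple[str, str]:
    """Return (unquoted_string, remaining_input)."""
    output: List[str] = []                # Step 1
    if not data.startswith('"'):          # Step 2: nothing to parse
        return "", data
    pos = 1                               # Step 3: skip the opening DQUOTE
    while pos < len(data):                # Step 4
        char = data[pos]
        pos += 1
        if char == "\\":
            # Step 4.1: a backslash escapes the next character, if any.
            if pos >= len(data):
                break
            output.append(data[pos])
            pos += 1
        elif char == '"':
            # Step 4.2: the closing DQUOTE ends the string.
            break
        else:
            output.append(char)           # Step 4.3
    return "".join(output), data[pos:]    # Step 5
```

As in the algorithm, an unterminated string is tolerated: everything up to the end of input is returned as the value.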

C. Changes from RFC5988

This specification has the following differences from its predecessor, RFC5988:

  • The initial relation type registrations were removed, since they’ve already been registered by RFC5988.
  • The introduction has been shortened.
  • The Link Relation Application Data Registry has been removed.
  • Incorporated errata.
  • Updated references.
  • Link cardinality was clarified.
  • Terminology was changed from “target IRI” and “context IRI” to “link target” and “link context” respectively.
  • Made assigning a URI to registered relation types serialisation-specific.
  • Removed misleading statement that the link header field is semantically equivalent to HTML and Atom links.
  • More carefully defined and used “link serialisations” and “link applications.”
  • Clarified the cardinality of target attributes (generically and for “type”).
  • Corrected the default link context for the Link header field, to be dependent upon the identity of the representation (as per RFC7231).
  • Defined a suggested parsing algorithm for the Link header.
  • The value space of target attributes and their definition has been specified.
  • The ABNF has been updated to be compatible with [RFC7230]. In particular, whitespace is now explicit.
  • Some parameters on the HTTP header field can now appear as a token.
  • Parameters on the HTTP header can now be value-less.
  • Handling of quoted strings is now defined by [RFC7230].
  • The type header field parameter now needs to be quoted (as token does not allow “/”).

Author's Address

Mark Nottingham
Email: mnot@mnot.net
URI: https://www.mnot.net/