REST is… representation?

[Note: since writing this, my understanding of the “real” REST terminology has improved, so I’m not sure much of this discussion is on the mark. I would delete the post altogether and start over, except that the comments are actually better than the post. So I appreciate the comments, and I’ll let the post stand, to my shame, to honor them. Just a heads-up.]

We’re in the process of creating a big data aggregator, which is going to be a massively distributed and parallelized data dump, where all our business units can just dump their data, and then that data will be available for them to run whatever map-reduce jobs they want. The problem I’m thinking about now is how to make that work, when the form of that data will vary from source to source, and even for a given source, it will vary over time.

Also, as our shop starts to put together all these disparate REST systems, every team is putting their own spin on the problems of URI definition and resource representation. It would be cool to pretend that we could just draft a standard, and it would be clear and perfect, and everyone would code to it strictly. But whoops — I work in the software industry, not in Nirvana. So that’s not going to happen.

Or, to use another metaphor that is appropriately, vaguely insulting: in this zoo, I work in the monkey house. At least, not anywhere that reason dominates.

So I’m starting to think about representation. We have transformation technology lying around that will help us get there, at the expense of CPU cycles, but it means laying out all the resources that span our business space in a sort of directed graph, and filling in the links with transformations. Do-able, but people internally are going to whine a lot. Which will be a side benefit.
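To make the directed-graph idea concrete, here is a rough sketch of what I have in mind. The format names and the toy string-wrapping transformations are made up for illustration; they are not our real formats or our real transformation technology. Each edge is one transformation, and converting between two representations means finding a chain of edges and applying them in order.

```python
from collections import deque

# Hypothetical representation names and toy transformations, for illustration only.
# Each key is an edge (source format, target format) in the directed graph.
TRANSFORMS = {
    ("orders_csv", "orders_json"): lambda data: f"json({data})",
    ("orders_json", "orders_xml"): lambda data: f"xml({data})",
    ("orders_xml", "warehouse_record"): lambda data: f"record({data})",
}


def find_path(source, target):
    """Breadth-first search for the shortest chain of formats from source to target."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for src, dst in TRANSFORMS:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None  # no chain of transformations connects the two formats


def convert(data, source, target):
    """Apply each transformation along the path, one edge at a time."""
    path = find_path(source, target)
    if path is None:
        raise ValueError(f"no transformation path from {source} to {target}")
    for src, dst in zip(path, path[1:]):
        data = TRANSFORMS[(src, dst)](data)
    return data
```

Every hop costs CPU cycles, which is exactly the trade-off mentioned above: the longer the chain, the more we pay for not having a shared representation.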

So I started thinking about REST — representational state transfer. In kicking ideas around, I’m wondering whether it’s correct to call REST resource-based, when really it’s representation-based.

If I come to a REST service, it’s because I want to GET or PUT some data (ignoring the bad-boy POST and the troublesome DELETE for now). And I’m going to GET or PUT the data I’m interested in, in a particular representation.

That data will probably map to some underlying resource — but it might not. The data might map to something ephemeral, like a search result. A search result isn’t really a resource in the system, it’s the *representation* of some *work* that I just did. I *might* create a key for that search, shove it in a data store, and treat it as persistent, but odds are I won’t.

A resource might be mapped with multiple URLs, and it might have multiple representations. A resource might be included in the representation of another related resource.

Even if I’m grabbing data from another source — let’s say a table of user accounts — the service is going to decide how deep to drill to add related records to the representation. My representation might be a very deep rendering of the data, or it might be shallow, with links that tell me where to get the rest of it.
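As a sketch of the shallow-versus-deep choice, assume a couple of hypothetical in-memory tables and a made-up `/addresses/{id}` URL scheme (none of this is from a real system). The same user record comes back either with its related record embedded or with a link to it:

```python
# Hypothetical in-memory tables standing in for a real data store.
USERS = {1: {"name": "Ann", "address_id": 10}}
ADDRESSES = {10: {"city": "Springfield"}}


def represent_user(user_id, deep=False):
    """Build a representation of one user; the service decides how deep to drill."""
    user = USERS[user_id]
    rep = {"name": user["name"]}
    if deep:
        # Deep rendering: embed a copy of the related record.
        rep["address"] = dict(ADDRESSES[user["address_id"]])
    else:
        # Shallow rendering: a link that says where to get the rest.
        rep["address"] = {"href": f"/addresses/{user['address_id']}"}
    return rep
```

Calling `represent_user(1)` gives the shallow form with an `href`, and `represent_user(1, deep=True)` embeds the address. Same underlying data, two different representations.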

It might come back as XML with a strict XSD schema, or it might just come back as a JSON bundle and you take-what-you-get. It might be a raw binary stream.
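Here is a minimal sketch of one piece of data rendered three ways, keyed by media type. The `ACCOUNT` record, the `<account>` element name, and the choice of media types are assumptions for illustration, and the XML here is not validated against any schema.

```python
import json
from xml.etree import ElementTree

# Hypothetical underlying data for a single resource.
ACCOUNT = {"id": 7, "owner": "Ann"}


def as_json(data):
    """Render the data as a JSON string with keys in sorted order."""
    return json.dumps(data, sort_keys=True)


def as_xml(data):
    """Render the data as a flat XML element, one child per field."""
    root = ElementTree.Element("account")
    for key, value in sorted(data.items()):
        ElementTree.SubElement(root, key).text = str(value)
    return ElementTree.tostring(root, encoding="unicode")


def as_bytes(data):
    """Render the data as a raw byte stream (UTF-8 encoded JSON here)."""
    return as_json(data).encode("utf-8")


# Map each media type to the function that produces that representation.
RENDERERS = {
    "application/json": as_json,
    "application/xml": as_xml,
    "application/octet-stream": as_bytes,
}


def represent(data, media_type):
    """Pick a representation of the same data by media type."""
    renderer = RENDERERS.get(media_type)
    if renderer is None:
        raise ValueError(f"no representation for {media_type}")
    return renderer(data)
```

Nothing about `ACCOUNT` changes between calls; only the rendering does.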

There’s a many-to-many relationship between representations and resources — so why are we equating them?

The point is, I think that calling a REST service resource-based is misleading. In nice, simple systems serving up static resources like JPG images or HTML files, sure, the representation maps directly and easily to the actual data in the data store. But if you step away from that and spend any CPU cycles at all “preparing” the data (even rendering columns in a table into JSON), then your interface is not resource-based anymore; it’s representation-based.

I haven’t thought this all the way through — obviously — but I’m having the nagging feeling that this is related to my problem with the big data dump, and all these disparate REST systems we’re building. I’m having the feeling that asking or even wishing for uniform representations is wrong-headed. That we should be thinking of these systems as a federation of *representation* services, not a federation of resource services.

Right now, I have the feeling that making that fine distinction will be liberating and make things easier, but I have to noodle on it some more.

Filed under REST

5 responses to “REST is… representation?”

  1. Eric J. Bowman

    You’ve now left the REST reservation… but, your intentions are pure, so I’ll do my best to help.

    “If I come to a REST service, it’s because I want to GET or PUT some data. And I’m going to GET or PUT the data I’m interested in a particular representation.”

    When I come to a REST service, it’s because I’m looking for some hypertext which self-documents the API, so my client can be trained what to GET and where to PUT, etc.

    Since JSON isn’t capable of a self-documenting API, the service must return (X)HTML or SVG. Let’s say we have a resource named “chart” with a URL:

    http://example.org/chart

    Let’s say we dereference that URL with an ‘Accept: application/json’ header. We’ll get back a JSON representation of the underlying data making up the chart.

    Let’s say we dereference that URL with an ‘Accept: application/svg+xml’ header. We’ll get back an SVG representation of the underlying data, as some form of chart.

    (This is my take on what you mean by “having transformations lying around”; your output media types may differ from those I’m suggesting.)

    Let’s say we dereference that URL with an ‘Accept: text/html’ header. We’ll get back an HTML representation of the underlying data, with the chart embedded as SVG, and a form underneath, with a textarea containing the underlying JSON. This could be the default representation, say a request has ‘Accept: text/html, application/svg+xml’ with identical q-values.

    So if the client is a Web Browser, it doesn’t need to ‘Accept: application/json’ at all. If the Browser supports SVG embedded in HTML, it can use either HTML 5 or XHR to PUT application/json from a textarea, without necessarily supporting JSON.

    The same holds for any client… you can’t really format JSON for an end-user (except to edit it), but you can send code to the client which renders something by consuming JSON — the client may need to ‘Accept: application/javascript’.

    The way to make my above example work, is to serve each possible representation of the /chart resource with ‘Vary: Accept’ and ‘Content-Location’ headers. For Content-Location, assign each variant its own URI to make it a resource in its own right.

    Now, when /chart returns ‘Content-Location: /chart.html’, the form it contains for updating the JSON raw data has @action=’/chart.json’. The textarea starts filled out with the original values, so you can edit them and PUT them back. If you don’t want to use PUT like that, then you should be using PATCH, if for example your JSON files are immensely huge.

    If you have a client that “just knows” that it can PUT to a resource which only has a JSON representation, then you have an HTTP API, not REST.

    If your API is hypertext-driven, such that you can GET SVG and PUT JSON (HTML may be skipped entirely by using XForms in SVG) from/to the same resource/subresource due to late binding of representation to resource, then you’re back on the RESTervation: REST really is resource-based.

    If you’re looking to build a REST API entirely from JSON, then you have real problems, because JSON does not define linking, fragments, or forms like SVG and (X)HTML do. Or table structures, or vector graphics, etc. JSON works fine as underlying data, you just have to interact with it RESTfully.

    (Like using an HTML form to PUT JSON to some URL, which is hypertext-driven, rather than having some JSON client which knows how to PUT despite PUT not being defined by application/json. *That* a resource may be manipulated is a given in an HTTP API, but a REST API must instruct a client *how* to manipulate a resource.)

    • roby2358

      > You’ve now left the REST reservation… but, your intentions are pure, so I’ll do my best to help.

      Thank you. 🙂 Your viewpoint is valuable, and I’m noodling over the rest of your reply. There’s some good stuff there.

      I think part of the problem is my understanding of the terms. I don’t think I have “resource” and “representation” down completely right yet. I’m getting the sense that in REST, “resource” is more like an abstract class than a specific thing, or even a specific type. And “representation” is more like the serialization scheme.

      However, I also have the nagging feeling that there’s a subtle impedance mismatch between “resource-based architecture” and “REST resources”.

      This is good, though. I need to tighten up my understanding of the terminology and usage, and I also need to figure out where The Industry is adding blurriness to the conversation.

  2. Eric J. Bowman

    “The data might map to something ephemeral, like a search result. A search result isn’t really a resource in the system, it’s the *representation* of some *work* that I just did.”

    Nope, it’s a resource… really. 😉 Perhaps it would help to think of a resource as a stored procedure. This is a resource:

    http://www.google.com/search?q=google

    I know it’s a resource because it has a URI.

    It’s a stored procedure whose results vary over time, i.e. “search the index for instances of the word google, and rank according to secret sauce”. The results may vary every time I GET that resource, but its conceptual mapping remains the same — in that way, it is *not* ephemeral.

    As Roy put it in his thesis, “A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.” A search response entity is ephemeral, Google’s algorithms may change, but the conceptual mapping remains static.

    “I *might* create a key for that search, shove it in a data store, and treat it as persistent, but odds are I won’t.”

    Or, you *might* assign a URI for that search, which executes the search on GET, and treat that as persistent…

    • roby2358

      > Perhaps it would help to think of a resource as a stored procedure.

      As a nit pick, for your example I would say the Resource is the *result set* that comes from executing that stored procedure.

      My RESTful API probably won’t expose the stored procedure itself… unless I specifically map a (different) URL to its T-SQL text.

      Does that fit?

  3. Pingback: This Week in #REST – Volume 10 (Mar 29 2010 – Apr 4 2010) « This week in REST
