Tag Archives: map reduce

REST is… representation?

[Note: since writing this, my understanding of the “real” REST terminology has improved, so I’m not sure much of this discussion is on the mark. I would delete the post altogether and start over, except that the comments are actually better than the post. So I appreciate the comments, and, to my shame, I’ll let the post stand to honor them. Just a heads-up.]

We’re in the process of creating a big data aggregator: a massively distributed, parallelized store where all our business units can dump their data and then run whatever map-reduce jobs they want against it. The problem I’m thinking about now is how to make that work when the form of that data varies from source to source and, even for a given source, varies over time.

Also, as our shop starts to put together all these disparate REST systems, every team is putting their own spin on the problems of URI definition and resource representation. It would be cool to pretend that we could just draft a standard, and it would be clear and perfect, and everyone would code to it strictly. But whoops — I work in the software industry, not in Nirvana. So that’s not going to happen.

Or, to use another metaphor that is appropriately and vaguely insulting: in this zoo, I work in the monkey house. At any rate, not anywhere that reason dominates.

So I’m starting to think about representation. We have transformation technology lying around that will help us get there, at the expense of CPU cycles, but it means laying out all the resources that span our business space in a sort of directed graph, and filling in the links with transformations. Do-able, but people internally are going to whine a lot. Which will be a side benefit.
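As a rough sketch of that directed-graph idea, assume each representation is a node and each transformation is an edge; converting from one representation to another then becomes a path search. The representation names ("csv", "rows", "json") and the converter functions below are made up for illustration and are not our actual transformation technology.

```python
import json
from collections import deque


# Hypothetical converters; each one is an edge in the graph.
def csv_to_rows(text):
    lines = text.strip().splitlines()
    header = lines[0].split(",")
    return [dict(zip(header, line.split(","))) for line in lines[1:]]


def rows_to_json(rows):
    return json.dumps(rows, sort_keys=True)


# (source representation, target representation) -> transformation
TRANSFORMS = {
    ("csv", "rows"): csv_to_rows,
    ("rows", "json"): rows_to_json,
}


def find_path(source, target):
    """Breadth-first search for the shortest chain of transformations."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for (src, dst), fn in TRANSFORMS.items():
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [fn]))
    return None


def convert(data, source, target):
    """Apply each transformation along the path, in order."""
    path = find_path(source, target)
    if path is None:
        raise ValueError(f"no transformation path from {source} to {target}")
    for fn in path:
        data = fn(data)
    return data


print(convert("id,name\n1,ann\n", "csv", "json"))
```

Every hop along the path costs CPU cycles, which is the trade-off mentioned above; a shortest-path search at least keeps the number of hops down.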

So I started thinking about REST — representational state transfer. In kicking ideas around, I’m wondering whether it’s correct to call REST resource-based, when really it’s representation-based.

If I come to a REST service, it’s because I want to GET or PUT some data (ignoring the bad-boy POST and the troublesome DELETE for now). And I’m going to GET or PUT the data I’m interested in, in a particular representation.

That data will probably map to some underlying resource — but it might not. The data might map to something ephemeral, like a search result. A search result isn’t really a resource in the system, it’s the *representation* of some *work* that I just did. I *might* create a key for that search, shove it in a data store, and treat it as persistent, but odds are I won’t.

A resource might be mapped with multiple URLs, and it might have multiple representations. A resource might be included in the representation of another related resource.

Even if I’m grabbing data from another source — let’s say a table of user accounts — the service is going to decide how deep to drill to add related records to the representation. My representation might be a very deep rendering of the data, or it might be shallow, with links that tell me where to get the rest of it.

It might come back as XML with a strict XSD schema, or it might just come back as a JSON bundle and you take-what-you-get. It might be a raw binary stream.
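To make that concrete, here is a minimal sketch of one underlying record served as several representations: shallow or deep, JSON or XML. The data, the field names, and the `render` function are all hypothetical; a real service would pick the format from the request's Accept header rather than a function argument.

```python
import json
from xml.etree import ElementTree as ET

# Hypothetical in-memory "data store"; names and fields are made up.
USERS = {"42": {"id": "42", "name": "Ann", "account_ids": ["a1", "a2"]}}
ACCOUNTS = {
    "a1": {"id": "a1", "balance": 10},
    "a2": {"id": "a2", "balance": 25},
}


def build_view(user_id, deep):
    """Decide how far to drill into related records."""
    user = USERS[user_id]
    view = {"id": user["id"], "name": user["name"]}
    if deep:
        # Deep rendering: embed the related records.
        view["accounts"] = [ACCOUNTS[a] for a in user["account_ids"]]
    else:
        # Shallow rendering: links that say where to get the rest.
        view["accounts"] = [{"href": f"/accounts/{a}"} for a in user["account_ids"]]
    return view


def render(user_id, media_type="application/json", deep=False):
    """Serialize the same view in whichever format was asked for."""
    view = build_view(user_id, deep)
    if media_type == "application/json":
        return json.dumps(view, sort_keys=True)
    if media_type == "application/xml":
        root = ET.Element("user", id=view["id"])
        ET.SubElement(root, "name").text = view["name"]
        for account in view["accounts"]:
            attrs = {key: str(value) for key, value in account.items()}
            ET.SubElement(root, "account", attrs)
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported media type: {media_type}")


print(render("42"))
print(render("42", "application/xml", deep=True))
```

One record in the store, four possible representations, and none of them is "the" resource.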

There’s a many-to-many relationship between representations and resources — so why are we equating them?

The point is, I think that calling a REST service resource-based is misleading. In nice, simple systems serving up static resources like JPG images or HTML files, sure, the representation maps directly and easily to the actual data in the data store. But if you step away from that and spend any CPU cycles at all “preparing” the data (even rendering columns in a table into JSON), then your interface is not resource-based anymore, it’s representation-based.

I haven’t thought this all the way through — obviously — but I’m having the nagging feeling that this is related to my problem with the big data dump, and all these disparate REST systems we’re building. I’m having the feeling that asking or even wishing for uniform representations is wrong-headed. That we should be thinking of these systems as a federation of *representation* services, not a federation of resource services.

Right now, I have the feeling that making that fine distinction will be liberating and make things easier, but I have to noodle on it some more.



Filed under REST

Functional REST

On the drive home a while back, I was thinking about the REST model, which is resource-oriented. Everything in the system is a resource. The pointers are all references to resources. The allowed operations are variations on CRUD, and no more. Well, not much more, if you’re doing it right.

So the next obvious thing is to make functions first-class objects, er… resources, as well.

I haven’t thought it all the way through, but I think that offers some interesting possibilities.

  • The resources that a client uses GET, PUT, and DELETE on are, in fact, stateless functions.
  • The stateless nature of functional programming fits perfectly with the stateless nature of REST services.
  • Someone who has the right permissions could PUT their own function definition into the system, and then POST data to invoke it.
  • The functional language could reference other functions, naming them by URI.
  • Functions in the system could take advantage of all the meta-patterns that functional programming provides, like mapping, reducing, filtering, composing, joining, etc. I bet you could even generate lambda functions.
  • URIs are long, so you’d probably want some sort of aliasing feature 😉
  • The input for POSTing could be a combination of the usual resource representation, like JSON, and URI references to other resources.
  • Monitoring would be hell. I think you’d want to have some sort of CPU meter and an automated kill switch, depending on how much leeway you wanted to give your users.
  • The access control model would be interesting, since a wide range of data resources might be involved.
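Here is a toy, in-memory sketch of how a few of those bullets might fit together. Everything in it is an assumption made for illustration: the `FunctionStore` class, the tiny definition format (`op`, `arg`, `of`, `fn`), and the URIs. Definitions are plain data that refer to each other by URI, so nothing a user PUTs is ever executed as raw code, and a crude step budget stands in for the CPU meter and kill switch.

```python
# The only code that actually runs; user definitions just reference these.
PRIMITIVES = {
    "add": lambda arg, x: x + arg,
    "mul": lambda arg, x: x * arg,
    "gt": lambda arg, x: x > arg,
}


class FunctionStore:
    def __init__(self, max_steps=100):
        self.definitions = {}
        self.max_steps = max_steps

    def put(self, uri, definition):
        """PUT: store (or replace) a function definition at a URI."""
        self.definitions[uri] = definition

    def get(self, uri):
        """GET: return the stored definition."""
        return self.definitions[uri]

    def delete(self, uri):
        """DELETE: remove the definition if it exists."""
        self.definitions.pop(uri, None)

    def post(self, uri, data):
        """POST: invoke the function at `uri` on `data`."""
        budget = [self.max_steps]  # fresh budget per invocation
        return self._apply(uri, data, budget)

    def _apply(self, uri, data, budget):
        budget[0] -= 1
        if budget[0] < 0:
            raise RuntimeError("step budget exceeded")  # the kill switch
        definition = self.definitions[uri]
        op = definition["op"]
        if op in PRIMITIVES:
            return PRIMITIVES[op](definition["arg"], data)
        if op == "compose":
            for ref in definition["of"]:  # functions named by URI
                data = self._apply(ref, data, budget)
            return data
        if op == "map":
            return [self._apply(definition["fn"], item, budget) for item in data]
        if op == "filter":
            return [item for item in data
                    if self._apply(definition["fn"], item, budget)]
        raise ValueError(f"unknown op: {op}")


store = FunctionStore()
store.put("/fn/double", {"op": "mul", "arg": 2})
store.put("/fn/inc", {"op": "add", "arg": 1})
store.put("/fn/double-then-inc", {"op": "compose", "of": ["/fn/double", "/fn/inc"]})
store.put("/fn/all", {"op": "map", "fn": "/fn/double-then-inc"})
print(store.post("/fn/all", [1, 2, 3]))  # [3, 5, 7]
```

The store itself holds state, of course, but each stored function is stateless: the result of a POST depends only on the definitions and the input. Access control and aliasing are left out entirely.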

Interestingly, this is along the lines of a system I put together about a year ago, where I used Java reflection to map URIs and POST bodies to a Command-pattern API that we have internally. At the time, I didn’t really think of it as “functional” (well, it was functional in the sense that it worked and people were really happy with it). I was just mapping URIs to the Command objects in the API, and I didn’t think too hard about the big picture, so I don’t think it counts.
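For flavor, a Python analogue of that URI-to-Command mapping might look like the sketch below. This is not the Java system described above; the command classes, the URI scheme, and the `dispatch` function are invented for illustration, with Python's class introspection standing in for Java reflection.

```python
import json


class Command:
    """Base class for the (hypothetical) Command-pattern API."""

    def execute(self):
        raise NotImplementedError


class CreateUser(Command):
    def __init__(self, name):
        self.name = name

    def execute(self):
        return {"created": self.name}


class AddNumbers(Command):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def execute(self):
        return {"sum": self.a + self.b}


def dispatch(uri, body):
    """Map a URI such as /commands/add-numbers to the AddNumbers class,
    build the command from the JSON POST body, and run it."""
    slug = uri.rstrip("/").rsplit("/", 1)[-1]
    class_name = "".join(part.capitalize() for part in slug.split("-"))
    # Discover the available commands by introspection instead of a
    # hand-maintained routing table.
    registry = {cls.__name__: cls for cls in Command.__subclasses__()}
    if class_name not in registry:
        raise LookupError(f"no command for {uri}")
    return registry[class_name](**json.loads(body)).execute()


print(dispatch("/commands/add-numbers", '{"a": 2, "b": 3}'))
```

Only direct subclasses of `Command` are reachable, so a URI cannot name an arbitrary class.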

It’s kind of an interesting idea. I did a search and didn’t find any other references to “functional REST”, but I’ll keep my eyes open. With all the functional programming wonks running around, it’s just a matter of time before someone puts a system like that together. …Someone probably has already.


Filed under REST