Tag Archives: open standards

REST — Verbs or Adverbs?

OK here I go with right-brain drift again.

The premise of the uniform interface for REST is appealing: you get 4 verbs, and you need to define your nouns in a way that leverages those 4 verbs. Without abusing POST.

So what if we defined our classes that way? What if I defined an interface with the following signatures:


import java.util.Map;

// One interface, four "verbs" -- every exposed object would implement this.
public interface Restful<T>
{
    // GET: look up a representation using simple name/value query parameters
    T get(Map<String, String> query);

    // PUT: create or replace the resource with the supplied representation
    Object put(T o);

    // DELETE: remove whatever the query identifies
    Object delete(Map<String, String> query);

    // POST: anything goes -- arbitrary input, arbitrary output
    Object post(Object input);
}

I could certainly write a program where all my objects used that interface. It would also map directly onto The Interwebs: I’d just need some sort of list of which objects were visible, and my application container could support remote HTTP communication with practically nothing in between.

It would look really weird though, and I think that’s why the Java folks moved instead to annotations: @GET @POST etc. That way, they could keep their regular object-oriented syntax, but still figure out where to hang hooks for RESTful calls.
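
Something like this, just as a sketch (plain JAX-RS; the Widget class and the paths are made up for illustration):

import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Ordinary-looking class; the annotations tell the container where to hang the RESTful hooks.
@Path("/widgets")
public class WidgetResource
{
    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Widget getWidget(@PathParam("id") String id)
    {
        return lookup(id);   // hypothetical lookup
    }

    @POST
    public Widget createWidget(Widget input)
    {
        return save(input);  // hypothetical save
    }

    private Widget lookup(String id) { return new Widget(); }
    private Widget save(Widget w) { return w; }
}

class Widget { }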

So why are we, as a herd, squeezing ourselves into the REST bottle?

Maybe the answer is that the HTTP/REST verbs should not be taken as verbs, but rather as adverbs. That is, a method that you invoke GETly has certain properties: it’s probably cacheable, it takes simple input and returns complex data, etc. In fact, we could describe the RESTful adverbs this way:

GET

  • cacheable, with constraints
  • input is a map of string to string
  • output is a complex (i.e. structured) object
  • output data is typed according to the class of the URI
  • safe — i.e. does not intend to alter data
  • idempotent — i.e. repeatable

PUT

  • sets or updates data
  • complex input data
  • no output aside from status messages
  • input data is typed according to the class of the URI
  • not cacheable
  • unsafe (data changes)
  • idempotent (repeatable)

DELETE

  • removes data from the system
  • no input — just the identifier
  • no output aside from status messages
  • not cacheable
  • sort of unsafe (data changes, but it’s an implementation decision whether deleting already-deleted data results in an error response)
  • idempotent (repeatable)

POST

  • the method takes an arbitrary input object and returns an arbitrary object
  • input is a complex object (i.e. structured data)
  • output is a complex object
  • input and output data types aren’t necessarily related to anything else
  • not cacheable
  • not safe
  • not idempotent
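
For what it’s worth, declaring the adverbs themselves would be trivial. These @GETly-style annotations are purely my own invention, just markers:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical "adverb" annotations -- markers only, no behavior of their own.
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface GETly {}

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface PUTly {}

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface DELETEly {}

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface POSTly {}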

So I could have a class where the accessors would look like

@GETly
public String getName()

@PUTly
public void setName(String name)

…because the getters and setters play nicely with the RESTful adverbs.

But other methods might play nice, too. For example, if you have a recalculate method that recalculates and sets a bunch of internal values in the class, that method works POST-fully.


@POSTly
public String recalculate()

And so on. A lot of the methods in my system — though I suspect fewer than we’d expect — would look like good old-fashioned input/output, like the classic POST case.


@POSTly
public ThozzRegulator calibrate(CalibrationTemplate template)

In fact, if we set aside the data semantics, maybe the RESTful adverbs really classify the fundamental behavior of all methods:

                 safe    not safe
idempotent       GET     PUT, DELETE
not idempotent   n/a     POST

We’d just need a verb that is safe but not idempotent (!??!?!?) and then we could apply an adverb to every method in our system. 😉

Filed under opinionizing, REST

Is REST === ROA?

I’ve had this feeling of unease around REST for a while, and I’m coming to the conclusion that there’s a central disconnect in the abstraction we’re using for the RESTful model.

REST is “Representational State Transfer”, so at least nominally it centers on the ideas of “state” and “representation”. But the state in a RESTful communication is tied to the representation, and only indirectly to the underlying resource.

That is, if I request a data resource using a GET, I get data back that represents the state of something in the system. But the data, whether it’s XML or JSON or tab-delimited lines, is just a convenient way to represent the data, rather than a pure serialization of the underlying database rows or whatnot.

For example, I might throw some things into the representation, like calculated values, or maybe hyperlinks, that don’t exist in the underlying object. Or I might leave some stuff out that the client has no business seeing.
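
To make that concrete, here’s a rough sketch; the Account and AccountRepresentation names are made up, but the point is that the representation adds a calculated value and a hyperlink, and leaves out a field the client has no business seeing:

// Hypothetical persistent object -- roughly what's in the database.
class Account
{
    String id;
    String ownerName;
    long balanceCents;
    String internalRiskCode;   // clients have no business seeing this
}

// Hypothetical representation returned by GET /accounts/{id}.
class AccountRepresentation
{
    String ownerName;
    String balanceFormatted;   // calculated value, not stored anywhere
    String selfLink;           // hyperlink, exists only in the representation

    static AccountRepresentation of(Account a)
    {
        AccountRepresentation rep = new AccountRepresentation();
        rep.ownerName = a.ownerName;
        rep.balanceFormatted = String.format("$%,.2f", a.balanceCents / 100.0);
        rep.selfLink = "/accounts/" + a.id;
        // note: internalRiskCode is deliberately left out
        return rep;
    }
}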

What makes it worse is that the “pure” data objects in my system might have several representations. Or else I might be able to GET resources that combine several underlying data objects. Or a lot of underlying data objects. Worse, the GET might return a representation of a calculated result, for which there’s no persistent underlying data object at all.

There is this idea, which I am fond of, that RESTful services are the same as Resource-oriented Architecture (ROA): that if you expose a service, follow all the REST rules, and diligently follow the GET PUT POST DELETE model, then you have an ROA system.

But trying to claim that REST is ROA puts us in a hard place when we look at questions like, “What about searches?” Search is one of those hard problems in the REST world, because it clearly belongs there, and yet there isn’t a persistent underlying resource that maps to a search result.

Ultimately, I think it comes down to a shared misconception that RESTful communications are Resource-oriented. But I don’t think that’s right — they are Representation-oriented.

I’m still kicking the idea around, but in the end I think we’re going to have to get rid of the idea that REST is ROA. They are very compatible, but still not the same.

There are already standards around for communicating data structure as well as data, but I think we’re going to have to rely on those to provide our ROA. REST is a useful model for shaping communications to remote services, but there’s still a big disconnect with what we’d really expect from a true ROA.

EDIT: I just did a little more reading, and it sounds like it boils down to this: “resource” doesn’t have a single definition. Lots of specs mean different things when they say “resource”. So that is an area where clarity is still emerging. In essence, saying you’re doing “ROA design” is like saying you’re doing “?OA design”. I guess I’m too practical, so it’s easy for me to discount the “resource” part and focus on the practical “representation” side. 😉

Filed under REST

OAuth and Auth …entication? …orization?

I had a funny experience a while back where I had a manager who tasked me with adding security to the service we were working on. He did me the favor first of going into the code and adding the core classes I would need.

…uh, thanks.

But our architectural direction was toward OAuth. So when I looked at the classes he’d created, I noticed they were all “Authenticate” or “Authenticated” names. Not “Authorize” or “Authorized” names.

Ouch. OK so he didn’t get the memo.

Then I moved to another team, and things were going swimmingly, until I got in an argument with my new manager about whether we were standing up an “Authorization” service or an “Authentication” service.

Sheesh. I guess great minds think — or don’t think — alike. History replicates itself in miniature.

Let’s be clear: OAuth is an *authorization* scheme, about allowing or disallowing access to resources and operations. It’s really about protecting data, and whether the owner of that data has authorized access to that data. Authentication is left as a “fill in the blank” step early in the process.

Authentication in OAuth is a moment in time — it’s a blip that happens early in the OAuth dance. The user supplies credentials, we validate them, and then we’re done with that. OAuth is not Basic Auth, where the client sends username and password every time. OAuth is the opposite of that. The client supplies credentials once, and if that’s OK, we get a token and don’t worry about it again.

In OAuth, the whole point is to provide an authorization scheme. That is, a client coming in to make a request is either authorized to make that request or not.

To use a metaphor, only people wearing Blue badges can get through this door. That’s the authorization process.

To further the badge metaphor, the guard at the front gate will check your driver’s license and the photo on your badge *once* to make sure you are who you say you are. That’s the authentication process.

After that, you’re free to roam around wherever your badge authorizes you to go.

So I guess this post is for my *next* manager, anticipating the argument we’ll have over the difference between authentication and authorization in OAuth.

Filed under OAuth

Is OAuth Stateless? Can it work for REST?

I spent a good chunk of the last year putting together a couple of different implementations of OAuth for internal use by our web services, and for some admin UI interfaces. The architecture guys were trying to push other development teams besides ours in that direction, too, so that we could have our services inter-operate, with a single security model.

Because I work in a large, multi-national corporation, nothing is that easy. There was lots of push back, and lots of opinions about how we should provide authorization around our web services and pages.

Or… lots of monkeys getting passionate about how to peel their bananas.

But one complaint that was kind of annoying was a standard rallying cry of the detractors that OAuth is not stateless. Ergo, it violates the principles of REST, and, ergo, is not consistent with the overall architectural direction.

So here is my take on that. I’m interested in what others think.

First, if the architecture team says “use REST,” and then later says “use OAuth,” it’s kind of silly to claim that the one rule disqualifies the other. Every technical choice is a series of considerations, and, one way or another, at the end of the day authorization and security concerns are going to trump just about anything else.

Second, how can you complain about OAuth’s state-ish-ness, and then turn around and put SSL in front of a REST service? I mean, come on.

However, ultimately, I’d like to come up with a better answer than “stop whining”, so I’ve been noodling on it some. Here’s what I think.

The OAuth protocol itself is not stateless: it requires the user to pass credentials one time, and then it maintains the state of the user’s authorization on the server side. But those are not considerations of the underlying HTTP protocol. It’s a higher-level concern. It’s the same as passing a “login” cookie or some other session token so the server can keep track of the user. Which is something we do all the time.

The point of statelessness is to make the servers on the REST side anonymous — so you can bring them up or tear them down at will, and leave the health of your service intact. If I have information held for clients on a particular box, and it goes down, then I have a problem, because all the clients interacting with that box have lost their state.

But the kind of state that OAuth maintains is above the level of the HTTP protocol, and represents a generalization of an application concern. So really it’s not the individual *server* that cares about your OAuth token, it’s the *application* that cares.

Which means you can push your OAuth token into a distributed cache like memcached, and your individual servers are safe. OAuth, while it requires a sort of “session” state, doesn’t affect the state of any particular server, and doesn’t force you to provide server affinity.
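
A minimal sketch of that idea (the TokenStore interface is made up, and the in-memory map is just a stand-in for memcached or whatever distributed cache you like):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical token store -- the point is that it lives outside any one server.
interface TokenStore
{
    void put(String token, String authorizationScope);

    String get(String token);   // null means "not authorized"
}

// In-memory stand-in for illustration; swap in a memcached-backed
// implementation and every REST server stays anonymous and disposable.
class InMemoryTokenStore implements TokenStore
{
    private final ConcurrentMap<String, String> tokens = new ConcurrentHashMap<>();

    public void put(String token, String authorizationScope)
    {
        tokens.put(token, authorizationScope);
    }

    public String get(String token)
    {
        return tokens.get(token);
    }
}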

In other words, yes, OAuth is stateful, but not the way you’re worried about. It doesn’t require the RESTful servers to be stateful.

In our implementation, we really quickly gave up on providing OAuth as a solution inside the application code itself, and almost all of our REST services externalize OAuth. For incoming traffic we use a reverse proxy in front of the service, and on the way out we use a regular proxy going to remote services. The “in” guy provides validation and bounces any unauthorized requests before they get to the REST server. The “out” guy adds the necessary headers to any outgoing request to satisfy OAuth.

That’s even nicer than terminating SSL in our load balancers, and I don’t hear anyone complaining about that.
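
For the curious, the “in” guy boils down to something like this servlet-filter sketch; the validateToken method is a stand-in for the real token lookup, and our actual implementation lives in a separate reverse proxy rather than inside the app:

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch of what the "in" proxy does: bounce unauthorized requests before
// they ever reach the REST code.
public class OAuthGateFilter implements Filter
{
    public void init(FilterConfig config) { }

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException
    {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) res;

        String authorization = request.getHeader("Authorization");
        if (authorization == null || !validateToken(authorization))
        {
            response.sendError(HttpServletResponse.SC_UNAUTHORIZED);
            return;
        }
        chain.doFilter(req, res);
    }

    public void destroy() { }

    // Stand-in: a real implementation would look the token up in the shared token store.
    private boolean validateToken(String authorizationHeader)
    {
        return authorizationHeader.startsWith("Bearer ");
    }
}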

I suppose I’ll have to continue scowling at the smug faces of developers who are convinced they’ve defeated OAuth with logic, but really I don’t see a problem with OAuth in a stateless server world.

Probably there’s a more concise way to put it, though, and I’d love to hear that.

So, to be clear, OAuth works well with REST, and doesn’t compromise any REST-ful-ness requirements for a server.

There are two ways to look at it:

+ authorization is part of the application concern, and part of a resource definition. Remember that the definition of Resource is arbitrarily large. So, “A list of ___ as allowed in this authorization scope.”

…or…

+ as a transparent concern provided by middleware, exactly as SSL (or even TCP) is. SSL doesn’t compromise the statelessness/REST-ful-ness of a service, despite its stateful protocol. One of the fundamental ideas of REST is to line up services in a way that allows dropping middleware in the flow to take care of concerns like routing, load balancing, encryption, authentication and authorization, without compromising the reliability or scalability of a resource server.

So OAuth, or really any token-passing authorization scheme, doesn’t compromise the validity of a REST service that it protects. You aren’t required to use Basic Auth, or prescribe SSL everywhere, to defend the REST-ful-ness of your service.

(However, statelessness is a good idea in itself … but the bigger issue for REST is that very few teams are implementing hypermedia in their service, so hardly anyone is implementing REST anyway.)

Filed under OAuth, REST

OAuth 1 vs. OAuth 2?

So the world was shocked recently when the new OAuth draft spec came out, and it was completely different from OAuth 1. It’s a much better spec overall, but still there’s a lot of work that has gone into the 1.0 implementations.

But then it occurred to me that since 1.0 and 2.0 have almost no overlap, there’s no reason why one has to obscure the other. So right now, I’m treating OAuth 2 as an extension to 1.0 that adds a Bearer Token protocol to the original spec. We can support that.

Right now, internally, we’re only using 2-legged OAuth anyway. And the work I need to do on the Bearer Token scheme will, ironically, add the pieces that I’m missing for a complete 1.0 3-legged implementation.

So once I started thinking of it that way, it reduced the angst over spec drift greatly.

Currently my biggest grief comes from how much detection I want to do, because I could certainly walk through the protocols and, for example, upgrade from an OAuth 2 Bearer Token to an OAuth 1 3-legged request.

Now my grief is coming from the architecture guys — someone has decided that he wants an implementation that’s “more lightweight” than what I developed (even though I told them for months what I was going to do, did it, scored a couple of HUGE wins with it, etc.). So right now my plan is to open-source the core classes of my implementation, then just wait for someone to show up and cut my throat. But, hey, that’s life in the big city.

Filed under OAuth

Getting started on OAuth 2.0

Sure, the OAuth 2 spec is still a draft, but so is the project we’re working on. I expect that by the time anyone is ready to talk to us, 2 will be solid enough and have enough adoption that we’ll have lots of partners to play with.

So far the questions we’re addressing are pretty basic.

  • Can we use OAuth 2 to protect web pages?

This was sort of out of the question in 1.0, but the Web Server flow gets most of the way toward providing a bearer token scheme for protecting some of the admin tools we’re standing up.

  • Can we pass an access token in a cookie instead of an Authorization header?

I mean, why not? The reason cookies were invented was exactly this: to pass “magic cookies” that would let you get past the authorization troll.

  • Can we attach application-specific name value pairs to an access token?

Right now it looks like we can pass keywords in the “scope” value of a request. While that’s just a space-delimited list of keywords, nothing in the OAuth 2 spec says they can’t be name/value pairs, as long as there are no spaces in them.
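
Just to illustrate what I mean (my own sketch, nothing from the spec): the scope stays a plain space-delimited list, and a keyword can simply happen to contain an equals sign.

import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: pack name=value pairs into an OAuth 2 "scope" string and unpack them again.
// Nothing OAuth-specific here -- the spec just sees a space-delimited list of tokens.
class ScopeCodec
{
    static String encode(Map<String, String> pairs)
    {
        StringBuilder scope = new StringBuilder();
        for (Map.Entry<String, String> e : pairs.entrySet())
        {
            if (scope.length() > 0) scope.append(' ');
            scope.append(e.getKey()).append('=').append(e.getValue());
        }
        return scope.toString();
    }

    static Map<String, String> decode(String scope)
    {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String token : scope.split(" "))
        {
            int eq = token.indexOf('=');
            if (eq > 0)
            {
                pairs.put(token.substring(0, eq), token.substring(eq + 1));
            }
            else
            {
                pairs.put(token, "");   // plain keyword, no value
            }
        }
        return pairs;
    }
}

So a scope like “tier=gold region=us” still looks like two ordinary keywords to anyone who doesn’t care about the pairs.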

In the end, we’ll probably fall this way:

Protect web pages? Yes.

Use cookies? Yes, in addition to the Authorization header and query string parameter.

Use name value pairs in “scope”? No, keywords should be enough for us.

On the last point, we may implement a second call to the Authorization Server to get more information tied to the access token than we got up front. Ya, that’s an extra hop, but right now we’re working with low-volume admin apps, so that should be OK.

Oh, and the other fun part is that we’re building an OAuth reverse proxy that sits in front of the webapp and takes care of the authorization — kind of like terminating SSL at the load balancer.

So far, pretty fun stuff.

Filed under OAuth

Will REST give us an Internet OS?

We were in a week of training, and it was pretty exhausting. The last day was the most interesting, because we got into the “advanced” stuff. The guy training us was a really smart guy, and had some good ideas. At one point he offered his vision of a sort of file system distributed across the web, where he could have, say, pictures scattered all over the place and just pull them in.

I perked up at that and observed that’s more or less the vision of REST… that by making access to resources uniform, you could just go out and grab them from whatever service was holding them at the time.

I didn’t mention that I hold a patent — for what that’s worth hehe — or at least my employer does 😉 — on a system for managing arbitrary resources by relating URIs to each other, with a state machine for managing the lifecycles of those relationships. Which is kind of part of what he was talking about.

Then, in another conversation, one of the guys on my team — a really, really sharp guy — was creating a RESTful interface for launching Map/Reduce jobs in Hadoop. As we were chatting I recommended he actually expose three addressable resources for that purpose: a mapper resource, a reducer resource — and a control resource that ties the other two together through URIs.
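
Roughly like this, as a sketch only; the paths and the JobControlResource name are mine, just to show the shape of it, not anything we actually built:

import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;

// Sketch only: three addressable resources, with the control resource
// tying a mapper and a reducer together by their URIs.
@Path("/jobs")
public class JobControlResource
{
    @PUT
    @Path("/mappers/{id}")
    public void defineMapper(@PathParam("id") String id, String mapperDefinition) { }

    @PUT
    @Path("/reducers/{id}")
    public void defineReducer(@PathParam("id") String id, String reducerDefinition) { }

    // POST a control document that refers to the mapper and reducer by URI
    // (e.g. "/jobs/mappers/wordcount" and "/jobs/reducers/sum") to launch the job.
    @POST
    @Path("/controls")
    public String launch(String controlDocument)
    {
        return "/jobs/controls/123";   // hypothetical URI of the launched job
    }

    @GET
    @Path("/controls/{id}")
    public String status(@PathParam("id") String id)
    {
        return "RUNNING";   // hypothetical status representation
    }
}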

Anyway, the upshot of all this is that as I pondered it some more, it occurred to me that I’ve always been talking about REST in the context of *services*. That is, how cool would it be if my service were just like yours and I didn’t have to spend 2 days of coding to write a client to your service.

And that’s a noble thought, but I think the broader, more powerful model that’s going to emerge is one of combined data and processing services across the internet. What would it take to turn the Internet into a giant OS?

An OS needs to store data, and it also needs to provide execution units. And we’re getting there with cloud computing. But the units of execution are still tightly bound to the idea of “a box”. We call our boxes “virtual”, but they are still boxes.

So what if “the box” an application ran on was The Internets?

This blog is a place for me to throw out (up?) any half-baked, often whiny ideas that pop into my head, so I don’t know if that’s just me being artistic, or it’s me missing the boat by 5 years again, or if that’s actually a good idea. But it seems like there’s something there.

And maybe if that happens, I can put that stupid patent of mine to work, finally.

Filed under REST