Tag Archives: computer architecture

Interfaces, Inside and Out

Here’s a neat little practice — splitting applications into pieces for “interface” and “engine”.

Once split up that way, it’s easy to shuffle classes around, because the “interface” classes are anything a remote client would need to see, and the “engine” classes are the pieces we can safely hide from the world. You can put them in an interface.jar, and distribute that, and an engine.jar, and deploy that.

That works really well, but there are a couple of surprises.

Not surprisingly, the basic classes in the interface package are interfaces. They simply define the boundary of the logic that lives in the engine. If my service is called the Wooga service, then I’ll have an interface class called Wooga.java that defines the methods my service exposes. There might be more than one, when I want to split my service into smaller pieces.

The first surprise: the data model classes go into the “interface” package. At first that seems odd, because it seems like the data model is an internal, hidden thing. But think about it: if I write a client and expose my data objects to that client, why should I have an “internal” package for data objects, and an identical “external” package?

No, data objects are part of the interface. If I expose a method called “calculatePayment” in the interface package, and it returns a PaymentSchedule object, then of course the returned object’s class is going to have to be in the interface too.

What it boils down to is that the defined interface for the application includes the data model classes that are shared with the client.
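Here is a minimal sketch of how that split might look in Java. Only Wooga, calculatePayment and PaymentSchedule come from the text above; the method signature, the WoogaEngine class and the interest-free installment math are assumptions made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.List;

// --- would live in interface.jar ---

/** The service boundary: the only thing a client codes against. */
interface Wooga {
    // Hypothetical signature; the post only names the method.
    PaymentSchedule calculatePayment(BigDecimal principal, int months);
}

/** Data model class returned across the boundary, so it ships with the interface. */
record PaymentSchedule(List<BigDecimal> installments) {
    BigDecimal total() {
        return installments.stream().reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}

// --- would live in engine.jar ---

/** Hidden implementation; equal interest-free installments, purely for illustration. */
class WoogaEngine implements Wooga {
    @Override
    public PaymentSchedule calculatePayment(BigDecimal principal, int months) {
        BigDecimal each = principal.divide(BigDecimal.valueOf(months), 2, RoundingMode.DOWN);
        List<BigDecimal> installments = new ArrayList<>();
        for (int i = 0; i < months - 1; i++) {
            installments.add(each);
        }
        // The last installment absorbs the rounding remainder so the total matches the principal.
        installments.add(principal.subtract(each.multiply(BigDecimal.valueOf(months - 1))));
        return new PaymentSchedule(List.copyOf(installments));
    }
}
```

A caller only ever declares a variable of type Wooga and receives a PaymentSchedule, so both of those types have to be on the client's classpath, while WoogaEngine does not.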

Sometimes there are internal data model artifacts — maybe related to authorization, accounting or monitoring — that of course don’t show up in the interface. But again, that’s fair. They go in the “engine” where they can safely be hidden from any consumer of my application.

Another surprising consequence of splitting interface and engine is that a remote client to the service and internal implementations of the service can implement the same interface. Which means they are transparently pluggable. That has a lot of power.

If you don’t want to pay the price of a network hop and additional infrastructure — you don’t want to run a separate box — just drop the engine.jar locally with your application and invoke the engine directly through the defined interface.

If you want to split it out later, just remove the engine.jar and leave interface.jar, and configure your application to use the remote client instead of the engine. You won’t have to make any code changes to your application. It has no idea the engine is now remote, except that everything takes a little longer.
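As a rough sketch of that pluggability: the code below repeats a pared-down Wooga interface so it stands alone, and everything else (RemoteWoogaClient, WoogaEndpoint, WoogaFactory, the comma-separated wire format, and a plain Function standing in for the network) is a hypothetical stand-in, not a real transport.

```java
import java.util.function.Function;

/** Pared-down service interface (hypothetical signature); ships in interface.jar. */
interface Wooga {
    long calculatePayment(long principalCents, int months);
}

/** engine.jar: the real logic, callable in-process. */
class WoogaEngine implements Wooga {
    @Override
    public long calculatePayment(long principalCents, int months) {
        return principalCents / months; // illustrative only: flat, interest-free
    }
}

/** interface.jar: a client that implements the same interface over some transport. */
class RemoteWoogaClient implements Wooga {
    private final Function<String, String> transport; // stand-in for an HTTP call

    RemoteWoogaClient(Function<String, String> transport) {
        this.transport = transport;
    }

    @Override
    public long calculatePayment(long principalCents, int months) {
        String request = principalCents + "," + months; // serialize the parameters
        String response = transport.apply(request);     // the "network hop"
        return Long.parseLong(response);                 // deserialize the result
    }
}

/** Server-side adapter that unpacks a request and delegates to the engine. */
class WoogaEndpoint {
    private final Wooga engine = new WoogaEngine();

    String handle(String request) {
        String[] parts = request.split(",");
        long result = engine.calculatePayment(Long.parseLong(parts[0]), Integer.parseInt(parts[1]));
        return Long.toString(result);
    }
}

/** The only place that knows which implementation is in use. */
class WoogaFactory {
    static Wooga create(String mode) {
        if ("remote".equals(mode)) {
            WoogaEndpoint endpoint = new WoogaEndpoint(); // in real life, another box
            return new RemoteWoogaClient(endpoint::handle);
        }
        return new WoogaEngine();
    }
}
```

Application code asks WoogaFactory for a Wooga and never learns which one it got, so switching from "local" to "remote" is a configuration change rather than a code change.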

I usually also throw my client implementation into the interface package as well, providing for serialization and deserialization of parameter objects and response objects. A client isn’t limited to this implementation, but in most cases they’re not interested in writing their own anyway.

Part of the motivation for this came from the idea of a “Hexagonal Architecture”, which is a response to the old “N-Tiered Architecture”. The idea is that a software component has a hidden internal implementation, which is accessible through a variety of interfaces — not just one as the N-Tiered architecture usually depicts.

The hexagonal component will also talk out to other components through their “interface” packages. This makes for a composable system, where pieces can be co-located, distributed, tiered, load-balanced, etc. in a very flexible and mostly transparent way.

More about the Hexagonal Architecture:

http://alistair.cockburn.us/Hexagonal+architecture
http://c2.com/cgi/wiki?HexagonalArchitecture

Filed under computer architecture

Is REST === ROA?

I’ve had this feeling of unease around REST for a while, and I’m coming to the conclusion that there’s a central disconnect in the abstraction we’re using for the RESTful model.

REST is “Representational State Transfer”, so at least nominally it centers on the ideas of “state” and “representation”. But the state in a RESTful communication is tied to the representation, and only indirectly to the underlying resource.

That is, if I request a data resource using a GET, I get data back that represents the state of something in the system. But the data, whether it’s XML or JSON or tab-delimited lines, is just a convenient way to represent the data, rather than a pure serialization of the underlying database rows or whatnot.

For example, I might throw some things into the representation, like calculated values, or maybe hyperlinks, that don’t exist in the underlying object. Or I might leave some stuff out that the client has no business seeing.
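To make that concrete, here is a small sketch in Java. The AccountRow fields, the AccountRepresentation class and the /accounts/ URL pattern are all invented for illustration; the point is only that the representation is built from the stored object rather than copied from it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Underlying stored object (hypothetical fields). */
record AccountRow(long id, String taxId, long balanceCents, long creditLimitCents) {}

class AccountRepresentation {
    /** Builds what a GET might return; not a one-to-one copy of the row. */
    static Map<String, Object> of(AccountRow row) {
        Map<String, Object> rep = new LinkedHashMap<>();
        rep.put("id", row.id());
        rep.put("balanceCents", row.balanceCents());
        // Calculated value with no column behind it.
        rep.put("availableCreditCents", row.creditLimitCents() - row.balanceCents());
        // Hyperlink that exists only in the representation.
        rep.put("self", "/accounts/" + row.id());
        // taxId is deliberately left out: the client has no business seeing it.
        return rep;
    }
}
```

Serializing that map to JSON or XML would give the client something that reflects the state of the account without being a dump of the row.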

What makes it worse is that the “pure” data objects in my system might have several representations. Or else I might be able to GET resources that combine several underlying data objects. Or a lot of underlying data objects. Worse, the GET might return a representation of a calculated result, for which there’s no persistent underlying data object at all.

There is this idea, that I am fond of, that RESTful services are the same as Resource-oriented Architecture. That if you expose a service, follow all the REST rules, and diligently follow the GET PUT POST DELETE model, then you have an ROA system.

But trying to claim that REST is ROA puts us in a hard place when we look at questions like, “What about searches?” Search is one of those hard problems in the REST world, because it clearly belongs there, and yet there isn’t a persistent underlying resource that maps to a search result.

Ultimately, I think it comes down to a shared misconception that RESTful communications are Resource-oriented. But I don’t think that’s right — they are Representation-oriented.

I’m still kicking the idea around, but in the end I think we’re going to have to get rid of the idea that REST is ROA. They are very compatible, but still not the same.

There are already standards around for communicating data structure as well as data, but I think we’re going to have to rely on those to provide our ROA. REST is a useful model for shaping communications to remote services, but there’s still a big disconnect with what we’d really expect from a true ROA.

EDIT: I just did a little more reading, and it sounds like it boils down to: “resource” doesn’t have a single definition. Lots of specs mean different things when they say “resource”. So that is an area of emerging clarity. In essence, saying you’re doing “ROA design” is like saying you’re doing “?OA design”. I guess I’m too practical, so it’s easy for me to discount the “resource” part and focus on the practical “representation” side. 😉

Filed under REST

Is OAuth Stateless? Can it work for REST?

I spent a good chunk of the last year putting together a couple of different implementations of OAuth for internal use by our web services, and for some admin UI interfaces. The architecture guys were trying to push other development teams besides ours in that direction, too, so that we could have our services inter-operate, with a single security model.

Because I work in a large, multi-national corporation, nothing is that easy. There was lots of push back, and lots of opinions about how we should provide authorization around our web services and pages.

Or… lots of monkeys getting passionate about how to peel their bananas.

But one complaint that was kind of annoying was a standard rallying cry of the detractors that OAuth is not stateless. Ergo, it violates the principles of REST, and, ergo, is not consistent with the overall architectural direction.

So here is my take on that. I’m interested in what others think.

First, if the architecture team says, use REST, and then later they say use OAuth, it’s kind of silly to say the one rule disqualifies the other. Every technical choice is a series of considerations, and, one way or another, at the end of the day authorization and security concerns are going to trump just about anything else.

Second, how can you complain about OAuth’s state-ish-ness, and then turn around and put SSL in front of a REST service? I mean, come on.

However, ultimately, I’d like to come up with a better answer than “stop whining”, so I’ve been noodling on it some. Here’s what I think.

While the OAuth protocol is not stateless, because it requires the user to pass credentials one time, and then maintain state of the user’s authorization on the server side, these are not considerations of the underlying HTTP protocol. It’s a higher-level concern. It’s the same as passing a “login” cookie or some other session token so the server can keep track of the user. Which is something we do all the time.

The point of statelessness is to make the servers on the REST side anonymous, so you can bring them up or tear them down at will, and leave the health of your service intact. If I have information held for clients on a particular box, and it goes down, then I have a problem, because all the clients interacting with that box have lost their state.

But the kind of state that OAuth maintains is above the level of the HTTP protocol, and represents a generalization of an application concern. So really it’s not the individual *server* that cares about your OAuth token, it’s the *application* that cares.

Which means you can push your OAuth token into a distributed cache like memcached, and your individual servers are safe. OAuth, while it requires a sort of “session” state, doesn’t affect the state of any particular server, and doesn’t force you to provide server affinity.

In other words, yes, OAuth is stateful, but not the way you’re worried about. It doesn’t require the RESTful servers to be stateful.
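A minimal sketch of that idea, with loud caveats: TokenStore, InMemoryTokenStore and RestServer are hypothetical names, and the in-memory map is only a stand-in so the example runs. It is not a memcached client, and a real deployment would put a distributed cache behind the same interface.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Shared token storage; in production this would be backed by a distributed cache. */
interface TokenStore {
    void put(String token, String userId);
    Optional<String> userFor(String token);
}

/** In-memory stand-in so the sketch runs; NOT a real memcached client. */
class InMemoryTokenStore implements TokenStore {
    private final Map<String, String> tokens = new ConcurrentHashMap<>();

    @Override
    public void put(String token, String userId) {
        tokens.put(token, userId);
    }

    @Override
    public Optional<String> userFor(String token) {
        return Optional.ofNullable(tokens.get(token));
    }
}

/** A REST server that keeps no per-client state of its own. */
class RestServer {
    private final TokenStore store;

    RestServer(TokenStore store) {
        this.store = store;
    }

    /** Returns an HTTP-like status code for a request carrying a token. */
    int handle(String token) {
        if (token == null) {
            return 401;
        }
        return store.userFor(token).isPresent() ? 200 : 401;
    }
}
```

Any number of RestServer instances can share one store, so a token issued while talking to one box is honored by every other box, including one started after the first has been torn down.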

In our implementation, we really quickly gave up on providing OAuth as a solution inside the application code itself, and almost all of our REST services externalize OAuth. For incoming traffic we use a reverse proxy in front of the service, and on the way out we use a regular proxy going to remote services. The “in” guy provides validation and bounces any unauthorized requests before they get to the REST server. The “out” guy adds the necessary headers to any outgoing request to satisfy OAuth.
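The two proxies could be sketched roughly like this. Everything here is an assumption for illustration: the Request record, the class names, and especially the bearer-style Authorization header, which is a simplification (a signature-based OAuth flow would add more than one static header).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Minimal request model for the sketch. */
record Request(String path, Map<String, String> headers) {}

/** The "in" guy: bounces unauthorized requests before the service sees them. */
class InboundProxy {
    private final Set<String> validTokens;            // would really be a shared token store
    private final Function<Request, Integer> service; // the protected REST service

    InboundProxy(Set<String> validTokens, Function<Request, Integer> service) {
        this.validTokens = validTokens;
        this.service = service;
    }

    int forward(Request request) {
        String auth = request.headers().get("Authorization");
        String prefix = "Bearer ";
        if (auth == null || !auth.startsWith(prefix)
                || !validTokens.contains(auth.substring(prefix.length()))) {
            return 401; // the service never sees this request
        }
        return service.apply(request);
    }
}

/** The "out" guy: adds the header so application code never touches OAuth. */
class OutboundProxy {
    private final String token;

    OutboundProxy(String token) {
        this.token = token;
    }

    Request decorate(Request request) {
        Map<String, String> headers = new HashMap<>(request.headers());
        headers.put("Authorization", "Bearer " + token);
        return new Request(request.path(), headers);
    }
}
```

The service function passed to InboundProxy contains no authorization logic at all, which is the whole appeal of externalizing it.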

That’s even nicer than terminating SSL in our load balancers, and I don’t hear anyone complaining about that.

I suppose I’ll have to continue scowling at the smug faces of developers who are convinced they’ve defeated OAuth with logic, but really I don’t see a problem with OAuth in a stateless server world.

Probably there’s a more concise way to put it, though, and I’d love to hear that.

So, to be clear, OAuth works well with REST, and doesn’t compromise any REST-ful-ness requirements for a server.

There are two ways to look at it:

+ authorization is part of the application concern, and part of a resource definition. Remember that the definition of Resource is arbitrarily large. So, “A list of ___ as allowed in this authorization scope.”

…or…

+ as a transparent concern provided by middleware, exactly as SSL (or even TCP) is. SSL doesn’t compromise the statelessness/REST-ful-ness of a service, despite its stateful protocol. One of the fundamental ideas of REST is to line up services in a way that allows dropping middleware in the flow to take care of concerns like routing, load balancing, encryption, authentication and authorization, without compromising the reliability or scalability of a resource server.

So OAuth, or really any token-passing authorization scheme, doesn’t compromise the validity of a REST service that it protects. You aren’t required to use Basic Auth, or proscribe SSL, everywhere to defend the REST-ful-ness of your service.

(However, statelessness is a good idea in itself … but the bigger issue for REST is that very few teams are implementing hypermedia in their service, so hardly anyone is implementing REST anyway.)

Filed under OAuth, REST

Will the Supermachine squash the Virtual Machine?

We’re in training this week for a system provisioning tool called RPath, which is really cool stuff.

http://www.rpath.com/corp/

It’s similar to a lot of configuration management tools available today, except for entire systems. There is a GUI, or command line tool, that allows you to compose a system from all the pieces you need — and just the pieces you need — and a ready-to-run image of the system pops out the other end. It can produce VM images, tars, or even ISOs (I believe). Cool stuff.

Anyway….

As slick as that is, and as we continue to go down this path of virtualization, every time I log onto a virtual box that I just created, I really have to wonder why I’m doing that. In a world where applications are deployed as virtual appliances, I’ve been getting the feeling more and more that having a full infrastructure set up to run a single servlet seems really kind of backwards and heavyhanded.

For example, these appliances tend to be isolated from broader networks by private IP addresses, and allow admin interaction only through ssh. User interaction is only through standard, narrowly-confined ports. So why do we need a password file or user permissions? As an appliance with dedicated functionality baked in to it, there’s little reason to log in to the thing at all… why do I need all the complexity of a multiuser system?

I do a “ps -eaf” on one of these boxes and see all sorts of processes running. Why? What are they doing that needs to be rolled up into a system isolated from all the others?

It seems like the model the technical world is moving to, is that the *real* operating system is the virtual hosting environment. It’s the technology that reels in all the physical hosts, and hands out CPU, disk, and network for services to run with. Why am I deploying to a virtual ‘nix or ‘does box at all?

I have the feeling that there is this awesome technology that tears down a lot of the constructs that, granted, were critical to getting us here in the first place. But now these virtual, dedicated appliances only do one thing. There aren’t 16 or 32 users logged in at a time. There aren’t multiple users’ files scattered around a file system, requiring security and permissions.

Seems like the problem is we have a collection of resources: disk space, CPU time, network bandwidth. And we have a number of tasks to complete. Virtual machines marry those two in creative ways, but wouldn’t we be well served to take a step back and look at a more direct way to bundle all that together?

I guess an analogy would be this: it seems like we’re die-casting instances of Stone Axes in solid titanium, because Stone Axes are well-understood, comfortable and familiar.

I don’t think a lot of the virtual or “cloud” providers will have any trouble if the traditional computer system just kind of evaporates into the cloud. They are very flexible about the kind of appliances they can bust out.

But I’m curious to see how that will unfold.

EDIT: lol should have read about Google App Engine before I wrote this 😉

Filed under cloud computing, computer architecture

Will REST give us an Internet OS?

We were in a week of training, and it was pretty exhausting. The last day was the most interesting, because we got into the “advanced” stuff. The guy training us was a really smart guy, and had some good ideas. At one point he offered his vision of a sort of file system distributed across the web, where he could have, say, pictures scattered all over the place and just pull them in.

I perked up at that and observed that’s more or less the vision of REST… that by making access to resources uniform, you could just go out and grab them from whatever service was holding them at the time.

I didn’t mention that I hold a patent (or at least my employer does 😉, for what that’s worth, hehe) on a system for managing arbitrary resources by relating URIs to each other, with a state machine for managing the lifecycles of those relationships. Which is kind of part of what he was talking about.

Then, in another conversation, one of the guys on my team (a really, really sharp guy) was creating a RESTful interface for launching Map/Reduce jobs in Hadoop. As we were chatting I recommended he actually expose three addressable resources for that purpose: a mapper resource, a reducer resource, and a control resource that ties the other two together through URIs.
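One way to picture those three resources is the sketch below. It is a guess at a shape, not what my teammate built: the record names, the URI paths, the CREATED/RUNNING states and the idea that launching is a state change on the control resource are all assumptions, and nothing here touches the Hadoop API.

```java
import java.net.URI;

/** Addressable mapper definition (hypothetical fields). */
record MapperResource(URI self, String className) {}

/** Addressable reducer definition (hypothetical fields). */
record ReducerResource(URI self, String className) {}

/** Control resource: carries no job code itself, only links plus lifecycle state. */
record JobControl(URI self, URI mapper, URI reducer, String state) {

    /** Ties an existing mapper and reducer together by their URIs. */
    static JobControl create(URI self, MapperResource m, ReducerResource r) {
        return new JobControl(self, m.self(), r.self(), "CREATED");
    }

    /** Launching becomes a state change on the control resource. */
    JobControl start() {
        if (!"CREATED".equals(state)) {
            throw new IllegalStateException("cannot start a job in state " + state);
        }
        return new JobControl(self, mapper, reducer, "RUNNING");
    }
}
```

Because the control resource only holds links, the same mapper or reducer can be reused by many jobs just by pointing at its URI.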

Anyway, the upshot of all this is that as I pondered it some more, it occurred to me that I’ve always been talking about REST in the context of *services*. That is, how cool would it be if my service were just like yours and I didn’t have to spend 2 days of coding to write a client to your service.

And that’s a noble thought, but I think the broader, more powerful model that’s going to emerge is one of combined data and processing services across the internet. What would it take to turn the Internet into a giant OS?

An OS needs to store data, and it also needs to provide execution units. And we’re getting there with cloud computing. But the units of execution are still tightly bound to the idea of “a box”. We call our boxes “virtual”, but they are still boxes.

So what if “the box” an application ran on was The Internets?

This blog is a place for me to throw out (up?) any half-baked, often whiny ideas that pop into my head, so I don’t know if that’s just me being artistic, or it’s me missing the boat by 5 years again, or if that’s actually a good idea. But it seems like there’s something there.

And maybe if that happens, I can put that stupid patent of mine to work, finally.

Filed under REST

The Central Problem in REST

So after much deep thought, I have uncovered the central problem in the modern interpretation of REST. I’ve had this nagging feeling that something doesn’t quite line up between “resource-oriented” design, and the restrictions on design declared for the sake of a “uniform interface”.

It’s not so much a problem with the original REST architecture, but a conflict between that original vision, and the notion of what we call RESTful these days.

Specifically, the “uniform” part of the uniform representation is there so a server can serve up data that a consumer can just pick up and render in a uniform way. So JPG images, HTML pages, and PDFs are all perfect examples. A browser just gets the data, then renders them. It doesn’t know or care what’s inside the representation, because it conforms to a uniform standard.

But the moment you have your REST service send back an application-specific representation, it’s not uniform anymore. Suddenly the client needs to know what the fields *mean* in order to use that information.

And it doesn’t matter if you’re using the HTTP verbs or not. As soon as you return any representation that only a specific set of clients are specially coded to understand, then it’s not uniform anymore.

The classic example where this shows up is the problem REST has with search. All the search engines have different URLs for their searches, and all of them return different representations. So they are, on their face, not uniform.

Another place where the central problem shows up is schema versioning. If the client can only understand a certain version of the representation, that conforms to a specific schema, then REST doesn’t have a good answer for that. The community is trying to come up with an acceptable standard: use content negotiation, put it in the URL, put it in a query string, put it in a special header? But there isn’t a clear answer, or a clear winner, because REST isn’t about how to manage non-uniform resources.
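For what it’s worth, here is what the content negotiation option might look like as a sketch. The Order record, the vendor media type names and the JSON field names are all made up for illustration; this is one of the competing approaches, not a recommended standard.

```java
import java.util.Optional;

/** Underlying data (hypothetical fields). */
record Order(long id, long totalCents) {}

class OrderRenderer {
    // Hypothetical vendor media types, one per schema version.
    static final String V1 = "application/vnd.example.order.v1+json";
    static final String V2 = "application/vnd.example.order.v2+json";

    /** Returns the body for the requested media type; empty means "406 Not Acceptable". */
    static Optional<String> render(Order order, String accept) {
        if (accept == null) {
            return Optional.empty();
        }
        switch (accept) {
            case V1:
                return Optional.of("{\"id\":" + order.id() + ",\"total\":" + order.totalCents() + "}");
            case V2:
                // v2 renames a field and adds one, so a v1-only client cannot read it.
                return Optional.of("{\"id\":" + order.id() + ",\"totalCents\":" + order.totalCents()
                        + ",\"currency\":\"USD\"}");
            default:
                return Optional.empty();
        }
    }
}
```

Notice that the client still has to be coded to know what “total” or “totalCents” means, which is exactly the non-uniformity this post is complaining about; negotiation only picks which private dialect gets spoken.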

Designing your service interfaces closely to the HTTP spec is a good idea, and leads to nice clean design. But it still isn’t consistent with REST, even if you disavow RPC and use all the HTTP verbs right. But then we should stop talking about services and how REST-ful they are, and start talking about how HTTP-ish they are. Citing REST just doesn’t work when you’re serving up specialized, non-uniform application data that requires specialized logic on the client side.

In the end, it may turn out that REST just isn’t an architecture well-suited to the purpose of sending around application data.

Filed under REST

Representation … Resource … and …?

OK I’m trying to get my brain wrapped around the terminology in REST. I think the important thing to bear in mind is that REST is a client-server, client-pull-message (or client-message-push) architecture. So it’s about how to identify and move information initiated by a client.

The main things that REST boils down to are:

  • Identifier: some arbitrary string which points at an instance of data from an abstract class of data
  • Representation: just some way to bundle up information for transmission
  • Resource: a conceptual idea of some information you want to either get, or put in place
  • Static resource: a resource that is backed by some data that’s going to stick around for a while, so you expect to get the same information back for multiple retrievals over time, more or less
  • Dynamic resource: a resource that is more ephemeral, like a calculation done on the fly, or an aggregation of other data that might be changing rapidly
  • …? : behind the resource, there will be some data storage, or processing, that results in a bundle of data we can call a resource and roll up into a representation and point to with an identifier

I see the last bit called an “entity” frequently, but really the REST architecture definitions, including Roy Fielding’s, mostly stop at the resource level and leave the rest to our imagination.

Personally, my main stumbling block is at the use of the term “resource”. I think of that word as something static, or even worse, a specific instance that I can put my hands on. So when I say, “a resource”, I usually think “the blue mouse in the top drawer of my desk.” In common usage, that blue mouse is a resource. It’s something I can use. But in the REST world, the Resource is “computer mouse”, and the Identifier adds “/desk/drawer?which=top”. So what I normally think of as “a resource” in common usage is really a Resource + Identifier in REST parlance.

In the OO world, an abstract class is a Resource, and an instance is Resource + Identifier. More or less.
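Stretching the mouse analogy into code, more or less: the Mouse record plays the part of the Resource, and a lookup by Identifier gets you the particular thing. The class names and the second drawer entry are made up, and a real REST resource is a concept rather than literally a Java type.

```java
import java.util.Map;
import java.util.Optional;

/** The Resource in REST parlance: the concept "computer mouse". */
record Mouse(String color) {}

class MouseResource {
    // Identifier -> the particular thing it points at (entries are made up).
    private static final Map<String, Mouse> BY_IDENTIFIER = Map.of(
            "/desk/drawer?which=top", new Mouse("blue"),
            "/desk/drawer?which=bottom", new Mouse("gray"));

    /** Resource + Identifier = the thing I can actually put my hands on. */
    static Optional<Mouse> resolve(String identifier) {
        if (identifier == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(BY_IDENTIFIER.get(identifier));
    }
}
```

Calling MouseResource.resolve with “/desk/drawer?which=top” hands back the blue mouse, while an identifier nobody has defined hands back nothing.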

I think one of the main reasons for all the blurriness is that what we call REST is really a collision of several worlds:

1) The REST architecture itself, as laid out by Roy Fielding. Which is really more about ways to string together client-server systems in a uniform way, granting interoperability and scalability at the expense of efficiency.

2) The HTTP protocol, which is the primary protocol people use to implement REST. It’s not at all correct to say “REST is HTTP”, but if you are doing the REST thing, and putting it on top of HTTP, then REST commands you, “Use HTTP strictly! Don’t use your own personalized variation of HTTP!”

A personalized variation of HTTP is essentially what 99% of the industry uses today.

After all, despite its pretensions, the software industry enjoys freedom from the tyranny of rational thought.

3) The ROA, or resource-oriented architecture crowd, which tends to take the basic terminology of REST and wrap it around a design based heavily on “static resources” or “nouns”. I say noun-based and not object-based, because the strict ROA guys require hyperlinks, but don’t allow intermixed data and methods like you’d see in OO.

4) All the personal preferences, biases, superstitions, agendas and personality disorders of everyone who’s involved with developing software for the Internets. Which, taken together, somehow fails to obliterate points 1-3. Most of the time.

Based on the conversations I’ve seen where I work, the primary religious conflict is between the static resource and dynamic resource crowd. Let’s face it, the Internets were built not just on a series of tubes, but on the GET/POST verbs. Chopping RPC out of REST flies in the face of what Fielding was trying to do in his paper — to capture what made the Internets work.

For example, some guys in my shop who were brave enough to dive into the Flex realm have been horrified to discover that Flex doesn’t support even the basic set of functionality required to implement a static-resource-over-strict-HTTP design. It basically just supports the GET/POST model, and no more. For some reason, the Flash guys thought that was all they needed.

All that said, I fall heavily on the “noun” side of that religious war, if only because it’s more challenging and fun.

So the upshot is, from my readings, I think that REST is a very open-ended idea that almost every web developer has been using for the past 10 years anyway, just because that’s the way browsers and web servers work. Now we’re getting strict about the HTTP protocol — which is a really good thing — and raising the awareness of the power and simplicity of noun-based design. Which is also a good thing.

But I’m going to try to keep the terminology straight, because I think the core ideas of REST are worth keeping in mind, and it’s important to keep the terminology clear in discussion.

Not that I’m good at that, especially where it doesn’t help my argument.

Filed under computer architecture, REST