More about State

Opinionizing again.

I liked the first part of my earlier post about state, but I was dissatisfied with the second part. After some subsequent discussions here, I kind of wanted to mull it over some more.

First, the basic premise in a distributed system is spreading out state information. In any remote call, you’re taking the state of some information at that moment, and either passing it back to a client, or acting on it.

Some state isn’t particularly contentious. For example, I haven’t changed my mailing address recorded in a lot of systems for a long time. And if I do need to change it, no one is going to be changing that information except me. There just isn’t a whole lot of contention over that information.

But imagine an online scheduling system. People might be reserving a time slot, cancelling their reservation, grabbing a time slot and then calling their spouse to confirm that slot is OK, and so forth. The state of any given piece of information is going to be very hotly contested. The rules around locking, updating, and committing any changes are going to be very important. Even reading the data will have to be done carefully, so people don’t make decisions based on state that is already stale.
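To make the contention concrete, here is a minimal sketch of a booking table that supports a tentative "hold" (the call-your-spouse window) that expires if it is never confirmed. Everything in it is an assumption for illustration: the `SlotBook` class, its method names, the in-memory dictionary, and the five-minute default are hypothetical, and a real system would keep this state in a shared store rather than in one process.

```python
import threading
import time


class SlotBook:
    """In-memory booking table; all names here are hypothetical."""

    def __init__(self, hold_seconds=300.0, clock=time.monotonic):
        self._lock = threading.Lock()  # serializes every read-modify-write
        self._hold_seconds = hold_seconds
        self._clock = clock
        # slot -> (user, expires_at); expires_at is None once confirmed
        self._slots = {}

    def _live_entry(self, slot):
        """Return the current (user, expires_at), dropping an expired hold."""
        entry = self._slots.get(slot)
        if entry is None:
            return None
        _, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._slots[slot]  # the hold lapsed, so the slot is free again
            return None
        return entry

    def hold(self, slot, user):
        """Tentatively grab a slot. Fails if someone else owns it."""
        with self._lock:
            entry = self._live_entry(slot)
            if entry is not None:
                owner, expires_at = entry
                if owner != user:
                    return False
                if expires_at is None:
                    return True  # already confirmed by this user
            self._slots[slot] = (user, self._clock() + self._hold_seconds)
            return True

    def confirm(self, slot, user):
        """Turn a live hold into a permanent reservation."""
        with self._lock:
            entry = self._live_entry(slot)
            if entry is None or entry[0] != user:
                return False
            self._slots[slot] = (user, None)
            return True

    def cancel(self, slot, user):
        """Release a hold or reservation owned by this user."""
        with self._lock:
            entry = self._live_entry(slot)
            if entry is None or entry[0] != user:
                return False
            del self._slots[slot]
            return True
```

The point of the expiry is that an abandoned hold cannot block the slot forever, and the single lock means nobody acts on a half-updated view of the table.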

What’s worse, there are fast transactions and slow transactions. I worked on a mobile user registration application where we’d send an SMS message as part of the process, but we had no idea when the response to that message would come in. It could be 2 seconds later, or it could be 2 days later. State information can be highly transient, or very long-lived.

On top of that, there are closely related ideas of “session state” and “application state”. Session state *can be* like the idea of a shopping cart. As I browse items, I add things to my shopping cart. Or I can remove them. Finally, when I’m ready to wrap up my session, I go to checkout and the system needs to calculate the total amount based on the items in my cart.

But it’s a little trickier than that — is my shopping cart really session state, or just another resource in my system that I’m making addressable and exposing through a public interface? Hmmm.

Then there is what I call “application state”. You could also call it “client state”. That’s things like which tab is visible, which buttons are grayed out, whether a dialog box is active, and so on. These usually don’t have anything to do with the state of server-side information; they have to do with the actions and choices that the client has made in interacting with the application.

Then again, you might be dealing with client technology that *can* maintain session state, like JavaScript in a browser. Or you might be dealing with a dumb client that can’t keep track of application state, like plain old HTML pages, and then you have to account for it somehow.

(BTW Every workplace that I’ve ever worked at has struggled with that idea. They kind of treat application state as this magical thing you just get for free. 😉 )

The last type of state I would call out is low-level protocol state. Usually this isn’t even visible to client and server; it’s just taken care of. For example, even HTTP communications go over a TCP connection, and TCP is *not* stateless. The server listens on a port, the client opens a connection to it and sends an HTTP request, and the server responds on the same connection.

The goal is to make sure that even communications that require linked state between client and server do so in a way that provides for retries and fail-overs and the like, so that the stateful communications *appear* to be stateless to folks higher in the stack.
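As a rough sketch of what "retries and fail-overs" can look like from the caller's side, the helper below tries each server in a list and retries on connection errors. The function name, the `send` callable, and the retry count are all hypothetical, and the sketch assumes the request is safe to repeat, since a retry may re-send something the server already processed.

```python
def call_with_failover(servers, send, attempts_per_server=2):
    """Try each server in turn, retrying on connection errors.

    `send(server)` is a hypothetical callable that performs one request
    and raises ConnectionError when the connection fails.
    """
    last_error = None
    for server in servers:
        for _ in range(attempts_per_server):
            try:
                return send(server)
            except ConnectionError as exc:
                last_error = exc  # remember the failure and keep going
    raise ConnectionError("all servers failed") from last_error
```

The caller just sees a call that either returns or fails; which connection carried it, and how many were torn down along the way, stays hidden lower in the stack.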

I don’t think you can codify a rule book, because there are no bright lines, and there are no absolute patterns you have to follow. So, just shooting from the hip, here are some thoughts I have about my experiences with handling state:

  • Watch out for single points of failure, where some piece of state information is carried on a single server, such that if the server goes down, the client has to start over from scratch.
  • That is, avoid scenarios that require “server affinity”, or attachment to a particular server in a bank of servers.
  • Think about a bank of caches to handle transient information, accessible by all your application servers.
  • Let the client handle as much of its application state as it can, preferably all of it.
  • Define your session state carefully, and consider what parts of session state you want the client to carry, and what parts you want to just call a “resource” and carry it on the server side.
  • Be very deliberate about making sure you’re providing enough transactional controls to handle the level of contention that you expect for your information.
  • On the flip side, when you have low-contention information, use lightweight mechanisms like optimistic locking to handle concurrent access to the information.
  • It’s usually cleaner to cache transient state in a bank of cache instances than to use things like clustering to distribute transient state between your application servers.
  • Be mindful of security! Don’t allow the client to carry along any state information that you don’t want them mucking with and changing! For example, don’t let the client carry the total cost of their shopping cart, at least not without confirming it on the server side at the end.
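For the shopping cart example in that last point, a minimal sketch of "confirm it on the server side" might look like the following. The price table, the SKU names, and the function names are made up for illustration; the only idea taken from the list above is that the server recomputes the total from its own data instead of trusting the number the client sends.

```python
from decimal import Decimal

# Hypothetical server-side price list; in practice this would live in a database.
PRICES = {"widget": Decimal("9.99"), "gadget": Decimal("24.50")}


def cart_total(items):
    """Compute the total from server-side prices. `items` maps SKU -> quantity."""
    total = Decimal("0")
    for sku, qty in items.items():
        if sku not in PRICES:
            raise ValueError(f"unknown item: {sku}")
        if not isinstance(qty, int) or qty <= 0:
            raise ValueError(f"bad quantity for {sku}: {qty!r}")
        total += PRICES[sku] * qty
    return total


def checkout(items, client_total):
    """Return the server-computed total; reject a client total that disagrees.

    `client_total` is assumed to be a decimal string such as "44.48".
    """
    total = cart_total(items)
    if Decimal(client_total) != total:
        raise ValueError("client total does not match server total")
    return total
```

The client can still carry the cart contents and display a total; it just never gets the final say on what is charged.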

Here’s a Wikipedia article about optimistic locking:

http://en.wikipedia.org/wiki/Optimistic_locking
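And here is a small sketch of the idea, assuming a version number stored next to each value. The `VersionedStore` class and its method names are hypothetical, and the in-process lock only stands in for whatever atomic compare-and-update the real data store provides (for example, an `UPDATE ... WHERE version = ?` in a database).

```python
import threading


class VersionConflict(Exception):
    """Raised when the stored version no longer matches what the writer read."""


class VersionedStore:
    """Toy key-value store with optimistic locking; names are hypothetical."""

    def __init__(self):
        self._guard = threading.Lock()  # protects only the brief compare-and-write
        self._rows = {}                 # key -> (version, value)

    def read(self, key):
        """Return (version, value); a missing key reads as version 0."""
        with self._guard:
            return self._rows.get(key, (0, None))

    def write(self, key, expected_version, value):
        """Write only if nobody else has written since `expected_version` was read."""
        with self._guard:
            current_version, _ = self._rows.get(key, (0, None))
            if current_version != expected_version:
                raise VersionConflict(
                    f"{key}: expected v{expected_version}, found v{current_version}"
                )
            new_version = current_version + 1
            self._rows[key] = (new_version, value)
            return new_version
```

Nothing is locked while the user is thinking. A writer who loses the race gets a `VersionConflict` and has to re-read and try again, which is cheap when conflicts are rare, as with my mailing address.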

Filed under computer architecture, opinionizing
