&yet

All of 2011

Re-using Backbone.js Models on the server with Node.js and Socket.io to build real-time apps

Quick intro, the hype and awesomeness that is Node

Node.js is pretty freakin’ awesome, yes. But it’s also been hyped up more than an Apple gadget. As pointed out by Eric Florenzano on his blog, a LOT of the original excitement of server-side JS was due to the ability to share code between client and server. Instead, the first thing everybody did was start porting all the existing tools and frameworks to node. Faster and better, perhaps, but it’s still largely the same ol’ thing. Where’s the paradigm shift? Where’s the code reuse?!

Basically, Node.js runs V8, the same JS engine as Chrome, and as such, it has fairly decent ECMAScript 5 support. Some of the stuff in “5” is super handy, such as the array iteration methods forEach, map, etc. But – and it’s a big “but” indeed – if you use those methods you’re no longer able to use ANY of that code in older browsers (read “IE”).

So, that is what makes underscore.js so magical. It gives you simple JS fallbacks for non-supported ECMAScript 5 stuff. That means that if you use it in node (or a modern browser), it will still use the faster native stuff, if available, but if you use it in a browser that doesn’t support that stuff your code will still work. Code REUSE FTW!
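To make the fallback idea concrete, here’s a minimal sketch of the pattern. It is not underscore’s actual source; the `map` helper below is a hypothetical stand-in that uses the native ES5 method when the host provides it and loops by hand otherwise.

```javascript
// Hypothetical helper, not underscore's real implementation:
// delegate to the native ES5 method when it exists, else loop by hand.
function map(list, iterator, context) {
    if (typeof Array.prototype.map === 'function' && list.map === Array.prototype.map) {
        return list.map(iterator, context);
    }
    var results = [];
    for (var i = 0; i < list.length; i++) {
        results.push(iterator.call(context, list[i], i, list));
    }
    return results;
}

// A real array takes the native path where available
var doubled = map([1, 2, 3], function (n) { return n * 2; });

// An array-like object has no native map, so it takes the fallback loop
var shouted = map({length: 2, 0: 'a', 1: 'b'}, function (s) { return s.toUpperCase(); });
```

Either way the calling code looks the same, which is the whole point: the same file can run in node and in an old browser.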

So what kind of stuff would we want to share between client and server?

Enter Backbone.js

A few months ago I got really into Backbone and wrote this introductory post about it that made the front page of HN. Apparently, a LOT of other people were interested as well, and rightfully so; it’s awesome. Luckily for us, Jeremy Ashkenas (primary author of backbone, underscore, coffeescript and all around JS magician) is also a bit of a node guru and had the foresight to make both backbone and underscore usable in node, as modules. So once you’ve installed ‘em with npm you can just do this to use them on the server:

var _ = require('underscore')._,
Backbone = require('backbone');

So what?! How is this useful?

State! What do I mean? As I mentioned in my introductory backbone.js post, if you’ve structured your app “correctly” (granted, this is my subjective opinion of “correct”), ALL your application state lives in the backbone models. In my code I go the extra step and store all the models for my app in a sort of “root” app model. I use this to store application settings as attributes and then any other models or collections that I’m using in my app will be properties of this model. For example:

var AppModel = Backbone.Model.extend({
defaults: {
attribution: "built by &yet",
tooSexy: true
 },

initialize: function () {
// some backbone collections
this.members = new MembersCollection();
this.coders = new CodersCollection();

// another child backbone model
this.user = new User();
}
});

Unifying Application State

By taking this approach and storing all the application state in a single Backbone model, it’s possible to write a serializer/deserializer to extract and re-inflate your entire application state. So that’s what I did. I created two recursive functions that can export and import all the attributes of a nested backbone structure and I put them into a base class that looks something like this:

var BaseModel = Backbone.Model.extend({
// builds and return a simple object ready to be JSON stringified
xport: function (opt) {
var result = {},
settings = _({
recurse: true
}).extend(opt || {});

function process(targetObj, source) {
targetObj.id = source.id || null;
targetObj.cid = source.cid || null;
targetObj.attrs = source.toJSON();
_.each(source, function (value, key) {
// since models store a reference to their collection
// we need to make sure we don't create a circular reference
if (settings.recurse) {
if (key !== 'collection' && source[key] instanceof Backbone.Collection) {
targetObj.collections = targetObj.collections || {};
targetObj.collections[key] = {};
targetObj.collections[key].models = [];
targetObj.collections[key].id = source[key].id || null;
_.each(source[key].models, function (value, index) {
process(targetObj.collections[key].models[index] = {}, value);
});
} else if (source[key] instanceof Backbone.Model) {
targetObj.models = targetObj.models || {};
process(targetObj.models[key] = {}, value);
}
}
});
 }

process(result, this);

return result;
 },

// rebuild the nested objects/collections from data created by the xport method
mport: function (data, silent) {
function process(targetObj, data) {
targetObj.id = data.id || null;
targetObj.set(data.attrs, {silent: silent});
// loop through each collection
if (data.collections) {
_.each(data.collections, function (collection, name) {
targetObj[name].id = collection.id;
Skeleton.models[collection.id] = targetObj[name];
_.each(collection.models, function (modelData, index) {
var newObj = targetObj[name]._add({}, {silent: silent});
process(newObj, modelData);
});
});
 }

if (data.models) {
_.each(data.models, function (modelData, name) {
process(targetObj[name], modelData);
});
}
 }

process(this, data);

return this;
}
});

So, now we can quickly and easily turn an entire application’s state into a simple JS object that can be JSON stringified and restored or persisted in a database, or in localstorage, or sent across the wire. Also, if we have these serialization functions in our base model we can selectively serialize any portion of the nested application structure.
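If you want to see the round trip without pulling in Backbone, here’s a stripped-down sketch. `TinyModel`, `makeApp` and the simplified `xport`/`mport` functions are hypothetical stand-ins for illustration; they only handle child models, not collections, and they are not the base class from this post.

```javascript
// TinyModel is a hypothetical stand-in for a Backbone model
function TinyModel(attrs) {
    this.id = null;
    this.attributes = attrs || {};
}

// export: attributes plus any child TinyModels hanging off the instance
function xport(model) {
    var out = {id: model.id, attrs: model.attributes, models: {}};
    for (var key in model) {
        if (model.hasOwnProperty(key) && model[key] instanceof TinyModel) {
            out.models[key] = xport(model[key]);
        }
    }
    return out;
}

// import: walk the same shape and push the data into an existing structure
function mport(model, data) {
    model.id = data.id;
    model.attributes = data.attrs;
    for (var key in data.models) {
        if (data.models.hasOwnProperty(key)) {
            mport(model[key], data.models[key]);
        }
    }
    return model;
}

// both ends build the same empty structure from shared code
function makeApp() {
    var app = new TinyModel({attribution: 'built by &yet'});
    app.user = new TinyModel({});
    return app;
}

var serverApp = makeApp();
serverApp.id = 'app-1';
serverApp.user.id = 'user-1';
serverApp.user.attributes.name = 'example';

var wire = JSON.stringify(xport(serverApp));          // what crosses the wire
var clientApp = mport(makeApp(), JSON.parse(wire));   // re-inflated copy
```

Note that the import side fills in an already-constructed structure rather than creating one, which is why the same model definitions need to exist on both ends.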

Backbone models are a great way to store and observe state.

So, here’s the kicker: USE IT ON THE SERVER!

How to build models that work on the server and the client

The trick here is to include some logic that lets the file figure out whether it’s being used as a CommonJS module or if it’s just in a script tag.

There are a few different ways of doing this. For example you can do something like this in your models file:

(function () {
var server = false,
MyModels;
if (typeof exports !== 'undefined') {
MyModels = exports;
server = true;
} else {
MyModels = this.MyModels = {};
 }

MyModels.AppModel...

})()

Just be aware that any external dependencies will be available if you’re in the browser and you’ve got other <script> tags defining those globals, but anything you need on the server will have to be explicitly imported.

Also, notice that I’m setting a server variable. This is because there are certain things I may want to do in my code on the server that won’t happen in the client. Doing this will make it easy to check where I am (we try to keep this to a minimum though, code-reuse is the goal).
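Here’s a small sketch of that detection logic pulled out into a function so both branches can be exercised in one place. The `attachModels` name and its two parameters are assumptions made for illustration; in a real models file you’d check the actual `exports` and the global object, as in the wrapper above.

```javascript
// Hypothetical helper: decide where the models namespace should live.
// maybeExports stands in for CommonJS `exports`; root stands in for the
// browser's global object.
function attachModels(maybeExports, root) {
    var server = typeof maybeExports !== 'undefined';
    var MyModels = server ? maybeExports : (root.MyModels = {});

    // code that differs per environment can branch on the flag
    MyModels.environment = function () {
        return server ? 'server' : 'browser';
    };
    return MyModels;
}

// simulate being loaded via require() in node
var fakeExports = {};
var onServer = attachModels(fakeExports, {});

// simulate being loaded from a <script> tag
var fakeWindow = {};
var inBrowser = attachModels(undefined, fakeWindow);
```

On the server the models end up on the module’s exports; in the browser they end up on a single global namespace.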

State syncing

So, if we go back to thinking about the client/server relationship, we can now keep an inflated Backbone model living in memory on the server and if the server gets a page request from the browser we can export the state from the server and use that to rebuild the page to match the current state on the server. Also, if we set up event listeners properly on our models we can actually listen for changes and send changes back and forth between client/server to keep the two in sync.

Taking this puppy realtime

None of this is particularly interesting unless we have the ability to send data both ways – from client to server and more importantly from server to client. We build real-time web apps at &yet–that’s what we do. Historically, that’s all been XMPP based. XMPP is awesome, but XMPP speaks XML. While JavaScript can do XML, it’s certainly simpler to not have to do that translation of XMPP stanzas into something JS can deal with. These days, we’ve been doing more and more with Socket.io.

The magical Socket.io

Socket.io is to Websockets what jQuery is to the DOM. Basically, it handles browser shortcomings for you and gives you a simple unified API. In short, socket.io is a seamless transport mechanism from node.js to the browser. It will use websockets if supported and fall back to one of 5 transport mechanisms. Ultimately, it goes all the way back to IE 5.5! Which is just freakin’ ridiculous, but at the same time, awesome.

Once you figure out how to set up socket.io, it’s fairly straightforward to send messages back and forth.

So on the server-side we do something like this on the new connection:

io.on('connection', function(client){
var re = /(?:connect.sid\=)[\.\w\%]+/;
var cookieId = re.exec(client.request.headers.cookie)[0].split('=')[1]
var clientModel = clients.get(cookieId)

if (!clientModel) {
clientModel = new models.ClientModel({id: cookieId});
clients.add(clientModel);
 }

// store some useful info
clientModel.client = client;

client.send({
event: 'initial',
data: clientModel.xport(),
templates:
 });

...

So, on the server when a new client connection is made, we immediately send the full app state:

io.on('connection', function(client) {
client.send({
event: 'initial',
data: appModel.xport()
});
});

For simplicity, I’ve decided to keep the convention of sending a simple event name and the data just so my client can know what to do with the data.

So, the client then has something like this in its message handler.

socket.on('message', function (data) { 
switch (data.event) {
case 'initial':
app.model.mport(data.data);
break;
case 'change':
...
}
});

So, in one fell swoop, we’ve completely synced state from the server to the client. In order to handle multiple connections and shared state, you’ll obviously have to add some additional complexity in your server logic so you send the right state to the right user. You can also wait for the client to send some other identifying information, or whatnot. For the purposes of this post I’m trying to keep it simple (it’s long already).
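The event-name convention also works nicely with a lookup table instead of a switch. This is just a sketch under my own assumptions: `dispatch` and the `handlers` table are hypothetical names, and the handlers only record what they received rather than touching real models.

```javascript
// handlers is a hypothetical lookup table keyed by the message's event name
function dispatch(handlers, message) {
    if (!Object.prototype.hasOwnProperty.call(handlers, message.event)) {
        return false; // unknown event: ignore it rather than throw
    }
    handlers[message.event](message.data);
    return true;
}

var received = [];
var handlers = {
    initial: function (data) { received.push(['initial', data]); },
    change: function (data) { received.push(['change', data]); }
};

var handledInitial = dispatch(handlers, {event: 'initial', data: {attrs: {tooSexy: true}}});
var handledUnknown = dispatch(handlers, {event: 'bogus', data: null});
```

In a real client the `initial` handler would call `mport` on the app model, as in the switch statement above.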

Syncing changes

JS is built to be event driven and frankly, that’s the magic of Backbone models and views. There may be multiple views that respond to events, but ultimately, all your state information lives in one place. This is really important to understand. If you don’t know what I mean, go back and read my previous post.

So, now what if something changes on the server? Well, one option would be to just send the full state to the clients we want to sync each time. In some cases that may not be so bad – especially if the app is fairly light, the raw state data is pretty small as well. But still, that seems like overkill to me. So what I’ve been doing is just sending the model that changed. So I added the following publishChange method to my base model:

publishChange: function (model, collection) {
var event = {};

if (model instanceof Backbone.Model) {
event = {
event: 'change',
model: {
data: model.xport({recurse: false}),
id: model.id
}
};
} else {
console.log('event was not a model', model);
 }

this.trigger('publish', event);
},

Then added something like this to each model’s init method:

initialize: function () {
this.bind('change', _(this.publishChange).bind(this));
}

So now we have an event type (in this case, change) and then we’ve got the model information. Now you may be wondering how we’d know which model to update on the other end of the connection. The trick is the id. What I’ve done to solve this problem is to always generate a UUID and set it as the id when any model or collection is instantiated on the server. Then, always register models and collections in a global lookup hash by their id. That way we can look up any model or collection in the hash and just set all our data on it. Now my client controller can listen for publish events and send them across the wire with just an id. Here’s my register function on my base model (warning, it’s a bit hackish):

register: function () {
var self = this;
if (server) {
var id = uuid();
this.id = id;
this.set({id: id});
}
if (this.id && !Skeleton.models[this.id]) Skeleton.models[this.id] = this;

this.bind('change:id', function (model) {
if (!Skeleton.models[model.id]) Skeleton.models[model.id] = self;
});
},

Then, in each model’s initialize method, I call register and I have a lookup:

initialize: function () {
this.register();
}

So now, my server will generate a UUID and when the model is sent to the client that id will be the same. Now we can always get any model, no matter how far it’s nested by checking the Skeleton.models hash. It’s not hard to deduce that you could take a similar approach for handling add and remove events as long as you’ve got a way to look up the collections on the other end.
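To show why a flat lookup hash makes nesting a non-issue, here’s a tiny sketch with plain objects. The `registry`, `makeModel` and `applyChange` names are hypothetical, and the message shape (an id plus a hash of changes) is an assumption for illustration.

```javascript
// Hypothetical stand-ins: a plain lookup hash and models with a set() method
var registry = {};

function makeModel(id, attrs) {
    var model = {
        id: id,
        attributes: attrs,
        set: function (changes) {
            for (var key in changes) {
                if (changes.hasOwnProperty(key)) {
                    this.attributes[key] = changes[key];
                }
            }
        }
    };
    registry[id] = model; // register by id at creation time
    return model;
}

// Apply an incoming change message to whichever model owns that id
function applyChange(message) {
    var target = registry[message.id];
    if (!target) {
        return false;
    }
    target.set(message.change);
    return true;
}

// a model nested a couple of levels deep is still one lookup away
var app = {team: {lead: makeModel('id-123', {name: 'old'})}};
var applied = applyChange({id: 'id-123', change: {name: 'new'}});
var missed = applyChange({id: 'no-such-id', change: {}});
```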

So how should this be used?

Well, there are three choices that I see.


  1. Send model changes from either the server or the client in the same way. Imagine we’re starting with an identical state on the server and client. If we now modify the model in place on the client, the publish event would be triggered and its change event would be sent to the server. The change would be set to the corresponding model on the server, which would then immediately trigger another change event, this time on the server echoing back the change to the client. At that point the loop would die because the change isn’t actually different than the current state so no event would be triggered. The downside with this approach is that it’s not as fault tolerant of flaky connections and it’s a bit on the noisy side since each change is getting sent and then echoed back. The advantage of this approach is that you can simply change the local model like you normally would in backbone and your changes would just be synced. Also, the local view would immediately reflect the change since it’s happening locally.


  2. The other, possibly superior, approach is to treat the server as the authority and broadcast all the changes from the server. Essentially, you would just build the change event in the client rather than actually setting it locally. That way you leave it up to the server to actually make changes and then the real change events would all flow from the server to the client. With this approach, you’d actually set the change events you got from the server on the client-side, your views would use those changes to update, but your controller on the client-side wouldn’t send changes back across the wire.


  3. The last approach is just a hybrid of the other two. Essentially, there’s nothing stopping you from selectively doing both. In theory, you can sync the trivial state information for example simple UI state (whether an item in a list is selected or not) using method #1 and then do more important interactions by sending commands to the server.
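The claim in option 1 that the echo loop dies on its own is easy to check with a toy model. This is a sketch, assuming a hypothetical `makeModel` whose `set` only reports a change when a value actually differs (which is how the post describes the behaviour); the direct call to the peer stands in for the socket.

```javascript
// Hypothetical mini-model: set() forwards a change only when a value differs
function makeModel(attrs) {
    return {
        attrs: attrs,
        peer: null,   // the model on the other end of the "wire"
        sent: 0,      // how many change messages this side has sent
        set: function (changes) {
            var changed = false;
            for (var key in changes) {
                if (changes.hasOwnProperty(key) && this.attrs[key] !== changes[key]) {
                    this.attrs[key] = changes[key];
                    changed = true;
                }
            }
            if (changed && this.peer) {
                this.sent++;
                this.peer.set(changes); // stands in for sending over the socket
            }
        }
    };
}

var client = makeModel({title: 'a'});
var server = makeModel({title: 'a'});
client.peer = server;
server.peer = client;

// client changes locally, server applies and echoes, client sees no difference
client.set({title: 'b'});
```

Each side sends exactly one message: the original change and its echo. That is the noise the post mentions, but it does terminate.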


In my experiments option 2 seems to work the best. By treating the server as the ultimate authority, you save yourself a lot of headaches. To accommodate this I simply added one more method to my base model class called setServer. It builds a change event and sends it through our socket. So now, in my views on the client, when I’m responding to a user action instead of calling set on the model I simply call setServer and pass it a hash of key/value pairs just like I would for a normal set.

setServer: function(attrs, options) {
socket.send({
event: 'set',
id: this.id,
change: attrs
});
}
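For completeness, here’s a sketch of what the server side of that exchange might look like. Everything here is hypothetical: `models`, `clients`, `makeClient` and `handleSet` are stand-ins with fake in-memory clients, not the actual server code behind this post.

```javascript
// Hypothetical server-side pieces: a model registry and connected clients
var models = {};
var clients = [];

function makeClient() {
    var client = {
        inbox: [],
        send: function (msg) { client.inbox.push(msg); } // fake socket send
    };
    clients.push(client);
    return client;
}

// The server applies the requested change, then broadcasts the real change event
function handleSet(message) {
    var model = models[message.id];
    if (!model) {
        return false; // unknown id: nothing to do
    }
    for (var key in message.change) {
        if (message.change.hasOwnProperty(key)) {
            model.attrs[key] = message.change[key];
        }
    }
    for (var i = 0; i < clients.length; i++) {
        clients[i].send({event: 'change', id: model.id, change: message.change});
    }
    return true;
}

models['task-1'] = {id: 'task-1', attrs: {done: false}};
var clientA = makeClient();
var clientB = makeClient();

// this is the kind of message a client's setServer call puts on the wire
var handled = handleSet({event: 'set', id: 'task-1', change: {done: true}});
```

The requesting client gets its change back through the same broadcast as everyone else, so its views update from the server’s version of events.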

Why is this whole thing awesome?

It lets you build really awesome stuff! Using this approach we send very small changes over an already established connection, so we can very quickly synchronize state from one client to another. The server can also get updates from an external data source and modify the model on the server, and those changes are immediately sent to the connected clients.

Best of all – it’s fast. Now, you can just write your views like you normally would in a Backbone.js app.

Obviously, there are other problems to be solved. For example, it all gets a little bit trickier when dealing with multiple states. Say, for instance, you have a portion of application state that you want to sync globally with all users and a portion that you just want to sync with other instances of the same user, or the same team, etc. Then you have to either do multiple socket channels (which I understand Guillermo is working on), or you have to sync all the state and let your views sort out what to respond to.

Also, there are persistence and scaling questions, some of which we’ve got solutions for, some of which we don’t. I’ll save that for another post. This architecture is clearly not perfect for every application. However, in the use cases where it fits, it’s quite powerful. I’m neck-deep in a couple of projects where I’m exploring the possibilities of this approach and I’ve gotta say, I’m very excited about the results. I’m also working on putting together a bit of a real-time framework built on the ideas in this post. I’m certainly not alone in these pursuits; it’s just so cool to see more and more people innovating and building cool stuff with real-time technologies. I’m thankful for any feedback you’ve got, good or bad.

If you have thoughts or questions, I’m @HenrikJoreteg on twitter. Also, my buddy/co-worker @fritzy and I have started doing a video podcast about this sort of stuff called Keeping It Realtime. And, be sure to follow @andyet and honestly, the whole &yet team for more stuff related to real-time web dev. We’re planning some interesting things that we’ll be announcing shortly. Cheers.


If you’re building a single page app, keep in mind that &yet offers consulting, training and development services. Hit us up (henrik@andyet.net) and tell us what we can do to help.


We’re sponsoring nodeconf?! Heck yes!

Server-side JS is no longer a punchline. It’s the future. And the future is now. We’re in it. Here we go. Yay, future!

If you’d asked most developers 5 years ago, most of them would have said: “Why would anyone want to write JS on the server?!” The luddites still do.

But we, at &yet, have fallen in love with node. Our particular schtick is the real-time web (see our podcast). We’ve been building real-time web apps for a while, mostly with XMPP and Strophe.js. Recently, however we’ve started using node + socket.io.

Frankly, we couldn’t be happier and can’t wait to see what the future holds as these technologies continue to mature.

We’re humbled and excited to get to be a part of what promises to be an amazing conference. Given that it sold out in about 4 minutes, clearly we’re not the only ones!

See you there!

/dev/castle

Our developers and family are all going to live and work in Italy for a month. In a castle. To be specific, this castle:

The Four-Hour Workweek irked me before I decided to read it and blew my mind once I chose to.

Not only that, I blame that book for the fact that our company and our families are now days from packing up our office and moving to Europe for a month to work from a castle on Italy’s Adriatic coast.

Seriously.

The title of the book irritated me.

Why? Because I don’t want to work four hours a week.

I want to throw myself into something I love and believe in, where I can create value and make a difference and learn and grow.

That kind of business is really at the heart of the Four-Hour Workweek. Ferriss tells readers to quit being enslaved to their jobs, create a business to fund their dreams and go for it—now!

Ferriss talks about dreamlining—focusing energy on making some amazing dream a reality as quickly as possible. If that means traveling the world, going on some grand adventure, or learning to be the best in the world at something, Ferriss offers productive paths to create a business that is your “muse”, cut all unnecessary expenses, smash all roadblocks, and go for it!

You really should read the book if you haven’t.

I don’t want to oversimplify what is actually a book crammed with tons of thought-provoking ideas, but a big piece of what Ferriss is talking about is making a lifestyle business for yourself.

The term “lifestyle business” gets a bad rap sometimes, but I like how Products Hero Amy Hoy puts it — “A lifestyle business is just a *profitable* business!”

Now, I already have a three-year profitable business that I built from the ground up with just me and the words “I AM DOING THIS” scrawled on a whiteboard.

And I feel fairly confident at this point that if I really wanted to, I could step away and it could support me doing whatever I wanted for a time.

But what if it wasn’t about me? Because it definitely isn’t in my book.

Most importantly, &yet is what it is because of *who* it is. And I’m just one small piece of that. &yet might actually be a lifestyle business, but it isn’t *my* lifestyle business…

And, in fact, I realized that this is actually what &yet is…

It’s a team lifestyle business.

When people ask me to describe &yet, I always think about Nathan’s take on &yet shortly after joining us. He summed it up thusly:

“&yet is a service to its employees.” [1]

So rather than me simply creating a business and a machine that makes me money as the sole owner, it is instead a business whose profits are largely used to empower its employees’ growth, freedom, and enjoyment. And, by the way, I would give ridiculously high marks to our team’s resulting creativity and effective productivity.

But back to Italy.

Yes. We are going to exclusively rent an amazing centuries-old castle on a hill that sleeps 56 and has a view of the ocean on the eastern coast of Italy.

We’re doing it because we want to, because we can, and because we believe the end results of our trip will be overwhelmingly positive.

We’re doing it because we’ve always been fascinated by Simon Willison’s /dev/fort idea. [2]

We’re doing it because we realized it’s *cheaper* to stay in a castle a little bit off the beaten path for a whole month than it is to stay in Rome for a week.

We’re doing it because this is a huge enough opportunity that all of our team members decided we’d be willing to skip one paycheck to make it happen. [3]

We’re doing it this year because we want to do something like this next year too.

And I’ll be honest: all of the go-for-it encouragement and assumption-busting in The 4-Hour Workweek pushed it from whim to reality.

So—thanks, Tim! Feel free to pop by the castle. I’d love to introduce my awesome team to you.

And to everyone else: What’s your dream? What’s stopping you from making it happen immediately?

If you’re doing something like this—or even if you want to, we want to hear about it!

We’ll be sharing about our adventure and being darn honest about the experiment’s successes and failures, so be sure to follow &yet on Twitter, along with our team members who are going.

———

[1] I have another blog post in which I want to unpack this statement a little more.

[2] Incidentally, one day we want to do a /dev/train, inspired by our external teammate, Hjon.

[3] The balance of which the company sneakily added back in to our wages by way of a permanent raise.

Using the REST interface as the JavaScript interface with Fermata

In support of an upcoming &yet product (ssssssh!), I was asked to create a JavaScript wrapper around a REST-based API we’re using from node.js.

If you’ve been there, you might know how it goes: guess which API features the current project actually needs, make up some sort of “native” object representation, implement some bridge code that kinda works, and as a finishing touch, slap a link to the service’s real documentation atop the code you left stubbed out for later.

Or, you find someone else’s wrapper library. They took the time to implement most features, and even wrote their own version of the documentation — but the project they needed it for was cancelled years ago, so their native library still wraps the previous version of the server API, without the new features you need.

FACTS

On one hand, the HTTP REST server offers all the newest features, with official usage documented by the service provider. On the other hand, your JavaScript code should be fluently written, following the native programming language idioms. Can we keep it that way?

It’s a paradox, really:


  • the REST interface is the BEST interface.

  • the best interface is a native interface.

Do you see where this is going? It took me a while, but after wrestling with the design of yet another web service wrapper, I finally saw the whole coin that I’d been flipping. It’s called Fermata, because when I finally put the two sides together I was working from a cheerful Italian caffè and needed a lively but REST-ful word.

In REST you have nouns and verbs — resources and methods — URLs and GET/PUT/POSTs. In JavaScript you have objects and methods — nouns and verbs. So in Fermata, URLs are objects, and methods are, well…methods on those objects:

var rest_server = fermata.api({url:"http://couchdb.example.com:5984"});
var my_document = rest_server.mydata.sample_doc;
my_document.put({title:"Fermata blog post", content:"?"}, function (err, response) { if (!err) console.log("Relax, your data is in good hands."); });

Hey, presto, abracadabra! Pretty simple, eh?

So…is it magic?

Yes, it is magic.

Explain this sleightly hand. Or die.

Easy there, fair Internet reader person! Your dollar was just hiding right there in your ear.

To make the dot syntax work without having to know all the paths available on the server, Fermata uses a feature of an upcoming JavaScript Harmony proposal called catch-all proxies. Proxy-fied objects finally give JavaScript developers a way to intercept all access to an object or function, injecting custom behaviour that would otherwise be impossible.

ECMAScript 5 (the latest JavaScript standard; you can tell it is Web Standard since it ends in 5) let us define property descriptors to handle the actual fetching and storing of pre-declared object keys — that is, you could have custom behaviour for specific properties only if their names were known beforehand:

var myObject = {};
Object.defineProperty(myObject, 'someSpecificProperty', {get: function () { return "someSpecificProperty has this value"; }});
myObject.someSpecificProperty === "someSpecificProperty has this value";
myObject.someOtherProperty === undefined;

ECMAScript Harmony (an in-progress proposal for the next version of JavaScript) wants to take this a step further: rather than just controlling a few pre-defined properties of an object, you can control access to any key — the property’s keyname is handed to a completely generic “get” function trap on the object:

var myObject = Proxy.create({get: function (obj, keyName) { return keyName + " has this value"; }}, {});
myObject.someSpecificProperty === "someSpecificProperty has this value";
myObject.someOtherProperty === "someOtherProperty has this value";

So the subpath keys of a Fermata URL don’t actually exist. (I told you it was magic.) Instead, when you assign var obj = url.path, the JavaScript engine calls a proxy “trap” handler function that Fermata provides: “hey, for the key named path on the object url, what should I say the value is?”. Fermata says: “I’ll make a new, slightly longer, URL proxy” and so that’s what the JavaScript engine assigns to var obj. If you then access a property on obj, Fermata just returns yet another object created via Proxy. Smoke and mirrors.
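Here’s a minimal sketch of the “slightly longer URL proxy” trick. Two caveats: it uses the `new Proxy(target, handler)` form that was eventually standardized in ES2015 rather than the `Proxy.create` draft API shown above, and `urlProxy` is a hypothetical function written for illustration, not Fermata’s source.

```javascript
// Sketch only, not Fermata's source: each property access returns a new,
// slightly longer URL proxy; calling the proxy returns the URL as a string.
function urlProxy(url) {
    var target = function () { return url; };
    return new Proxy(target, {
        get: function (fn, key) {
            // ignore symbol keys that the engine may probe for
            if (typeof key !== 'string') {
                return undefined;
            }
            return urlProxy(url + '/' + encodeURIComponent(key));
        }
    });
}

var restServer = urlProxy('http://couchdb.example.com:5984');
var myDocument = restServer.mydata.sample_doc; // neither key exists anywhere
var documentUrl = myDocument();
var escapedUrl = restServer['a b']();
```

Because no `apply` trap is defined, calling the proxy simply calls the underlying function, which is what hands back the accumulated URL.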

Of course, where there’s smoke and mirrors there must be fire and medicine cabinets. I said ECMAScript Harmony is a proposal that “wants to” standardize Proxy objects in JavaScript: a future version of JavaScript. Fortunately for us impatient types, an intrepid developer named Sam Shull has stocked the node.js medicine cabinet with node-proxy. While it differs a little from the official Harmony proposal, his V8 Proxy library made Fermata possible. Made Fermata magic.

dramatic pause

But my web browser isn’t magic, yet

Firefox 4’s JavaScript engine implements the new Proxy object feature, but to reliably use Fermata’s magic on the web we’d have to wait for broader support. (Chrome might be next; the race is on, fellas!) In the meantime, I’ve designed Fermata so that anything you can do with dots and brackets you can do with parentheses, and more!

var homebase = fermata.api({url:"", user:"webapp", password:SESSION_ID});
var latestMessages = homebase('api')('user')('messages.json');
latestMessages() === "/api/user/messages.json"; // use empty parens for the URL as a string
latestMessages.get(function (e, messages) { console.log(messages); });

You can also use the parenthesis syntax to pass an array, which is how you prevent the automatic URL component escaping Fermata does normally. Starting from the CouchDB rest_server example above:

var recent_docs = rest_server('mydata')(['_design/app/_view/by_date'], {reduce:false, descending:true, limit:10});  // keep a view query handy
recent_docs() === "http://couchdb.example.com:5984/mydata/_design/app/_view/by_date?reduce=false&descending=true&limit=10";
recent_docs.get(...you know the drill...);
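The parentheses syntax needs no Proxy support at all, just functions that return functions. Here’s a rough sketch of how that chaining could work. `urlBuilder` and `toQuery` are hypothetical helpers, not Fermata’s implementation, and the sketch only builds URL strings (no HTTP verbs).

```javascript
// Sketch only, not Fermata's source. A string argument is escaped as one path
// component; an array argument is joined with "/" and left unescaped.
function toQuery(query) {
    var pairs = [];
    for (var key in query) {
        if (query.hasOwnProperty(key)) {
            pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(query[key]));
        }
    }
    return pairs.length ? '?' + pairs.join('&') : '';
}

function urlBuilder(base, query) {
    return function (part, nextQuery) {
        if (arguments.length === 0) {
            return base + toQuery(query || {}); // empty parens: URL as a string
        }
        var piece = Array.isArray(part) ? part.join('/') : encodeURIComponent(part);
        return urlBuilder(base + '/' + piece, nextQuery || query);
    };
}

var homebase = urlBuilder('');
var latestMessages = homebase('api')('user')('messages.json');
var messagesUrl = latestMessages();

var restServer = urlBuilder('http://couchdb.example.com:5984');
var recentDocs = restServer('mydata')(['_design/app/_view/by_date'],
    {reduce: false, descending: true, limit: 10});
var recentDocsUrl = recentDocs();

// a plain string containing a slash gets escaped instead
var escapedUrl = homebase('a/b')();
```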

Volunteer from the audience

I’d encourage you to give Fermata a spin the next time you only want magic cutting between you and your favorite REST service. It’s hosted on github and installable via npm, under the terms of your friendly local MIT License.

After writing and using various REST wrapper interfaces through the years, I’m excited that I can finally speak both fluent HTTP and native JavaScript at the same time. In the office next door, Henrik is already using it from node.js to access several REST service APIs via the one consistent interface Fermata provides. As web applications move more code to the client, and more services implement careful CORS support, Fermata can provide a high-level AJAX microframework in the browser too.

One next step for Fermata is to add plug-in support for taking care of things like default URLs, setting required headers, converting from XML instead of JSON, and signing OAuth access. The idea is not to wrap the wrapper. More like a musical key signature: do some initial site-specific setup, and the plugin will take care of any API-specific themes while the rest of your JavaScript notation is consistent. Something along the lines of:

var twitter_client = fermata.api({twitter:CLIENT_KEY, user:ID, solemn_developer_promise:"I accept and do acknowledge Tweetie's forever victory, it was a fantastic app while earning its overlord status."});
twitter_client.statuses.user_timeline.get(...); // same ol' Fermata, but plugin is handling OAuth and format stuff

…maybe? Feedback on the plug-in interface, and anything really, is always appreciated!

An Introduction to Thoonk!

A persistent (and fast!) system for push feeds, queues, and jobs, leveraging Redis.

As application developers, we persist data in tables which are constantly updated, leaving most of the application’s components and user-interface in the dark until it asks for the data.

[Movie trailer voice] Imagine a world where these tables push change-events to any piece of your application stack, in diverse languages and on multiple servers.[/Movie trailer voice]

Enter Thoonk.

Clustering Node.js instances, communicating between service components in different languages and on different machines, forking off asynchronous jobs for reliability and queuing of work, communicating between APIs and views, and sending events to real-time webapps are all problems that can be solved with messaging.

Thoonk solves these problems more gracefully than simple messaging because the messages are change-events on persisted data.

Thoonk is a Redis schema for manipulating advanced, live objects (feeds, sorted-feeds, queues, job-queues, etc.). Thoonk is also a couple of implementations of this schema (currently thoonk.js for Node.js and thoonk.py for Python).

Thoonk is a lot of things, which I will describe, but really what I would like you to get out of this is what the concept is useful for.

A feed is a list of data entries that have publish, edit, retract, and other events associated with those entries. A feed brings to mind ATOM or RSS to most people, but I think feeds are more useful when the associated events are broadcast on publish-subscribe channels so that data can be synchronized. Redis contains both of the necessary components (object storage and publish-subscribe channels).

Thoonk feeds enable our “live tables” fantasy.

Let’s get specific about Thoonk feed-types.


Please refer to the Thoonk.js and Thoonk.py documentation for examples.

The basic feed is a list of items sorted by publish time. Verbs on these objects include publish, edit, and retract. Feeds may be configured to have a max-number of items, which when exceeded, drops the oldest items. Every item may have a unique assigned id, or Thoonk will generate one for you.
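To make the verbs concrete, here is a minimal in-memory sketch of a basic feed. It is not the Thoonk.js or Thoonk.py API: the `Feed` class, its method names, and the callback signature are all made up for illustration, a plain Python list of callbacks stands in for Redis publish-subscribe, and reporting a dropped item as a retract is an assumption of the sketch.

```python
import itertools
from collections import OrderedDict


class Feed:
    """Hypothetical in-memory stand-in for a feed: ordered items plus change-events."""

    def __init__(self, name, max_items=None):
        self.name = name
        self.max_items = max_items
        self.items = OrderedDict()   # item id -> item, oldest first
        self.subscribers = []        # callables taking (event, item_id, item)
        self._ids = itertools.count(1)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def _emit(self, event, item_id, item=None):
        for callback in self.subscribers:
            callback(event, item_id, item)

    def publish(self, item, item_id=None):
        """Add a new item, or edit it in place when the id already exists."""
        if item_id is None:
            item_id = "auto-%d" % next(self._ids)
        event = "edit" if item_id in self.items else "publish"
        self.items[item_id] = item
        self._emit(event, item_id, item)
        # Enforce the configured limit by dropping the oldest items.
        while self.max_items is not None and len(self.items) > self.max_items:
            old_id, _ = self.items.popitem(last=False)
            self._emit("retract", old_id)
        return item_id

    def retract(self, item_id):
        if item_id in self.items:
            del self.items[item_id]
            self._emit("retract", item_id)


events = []
feed = Feed("news", max_items=2)
feed.subscribe(lambda event, item_id, item: events.append((event, item_id)))
feed.publish("first", item_id="a")
feed.publish("second", item_id="b")
feed.publish("second, corrected", item_id="b")   # same id, so this is an edit
feed.publish("third", item_id="c")               # exceeds max_items, so "a" is dropped
```

Every subscriber sees the same stream of events, which is the "live table" idea: the stored items and the notifications about them never drift apart.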

Sorted-Feeds are similar to feeds, but they have no item limit (beyond practical memory limitations) and are sorted by publishing items relative to existing item ids. Verbs for sorted-feeds include append, prepend, publishBefore, publishAfter, move, edit, and retract. Sorted-feeds emit position updates when an item is published or moved in addition to publish, edit, and retract events.
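A rough sketch of the sorted-feed idea, again in memory and with made-up names (`SortedFeed`, `publish_before`, `move_after`). The shape of the position event here, an item id plus the id of the item it now follows, is an assumption for illustration and not Thoonk's wire format.

```python
class SortedFeed:
    """Hypothetical sorted feed: order is set by where items are published."""

    def __init__(self):
        self.order = []    # item ids in feed order
        self.items = {}    # item id -> item
        self.events = []   # (event, item_id, detail), standing in for pub-sub

    def _emit_position(self, item_id):
        index = self.order.index(item_id)
        previous_id = self.order[index - 1] if index > 0 else None
        self.events.append(("position", item_id, previous_id))

    def _insert(self, index, item_id, item):
        self.order.insert(index, item_id)
        self.items[item_id] = item
        self.events.append(("publish", item_id, item))
        self._emit_position(item_id)

    def append(self, item_id, item):
        self._insert(len(self.order), item_id, item)

    def prepend(self, item_id, item):
        self._insert(0, item_id, item)

    def publish_before(self, existing_id, item_id, item):
        self._insert(self.order.index(existing_id), item_id, item)

    def publish_after(self, existing_id, item_id, item):
        self._insert(self.order.index(existing_id) + 1, item_id, item)

    def move_after(self, existing_id, item_id):
        self.order.remove(item_id)
        self.order.insert(self.order.index(existing_id) + 1, item_id)
        self._emit_position(item_id)


playlist = SortedFeed()
playlist.append("a", "first track")
playlist.append("c", "third track")
playlist.publish_before("c", "b", "second track")
playlist.prepend("z", "intro")
playlist.move_after("c", "z")   # the intro becomes the outro
```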

Queues contain items that can be placed at the beginning or end, producing FIFO and LIFO queues. A queue get is a blocking operation with an optional timeout that pops an item off of the end. Queues can be used for simple messaging and task distribution.

Job channels distribute items in a guaranteed completion manner. Jobs consist of three queues: available jobs, in-flight jobs, and stalled jobs. Like queues, jobs can be pushed to the beginning or end of available jobs and getting a job is a blocking operation with a timeout. Job verbs include: publish, retract, get, cancel (place an in-flight job back into available-jobs), stall (place a job out of the way that has been a problem), retry (place a stalled job as available).
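The job lifecycle is easier to follow as a small state machine. This is a sketch under assumptions, not the Thoonk implementation: the `JobQueue` class is hypothetical, `finish` is a name invented here for completing a job, cancelled jobs are put back at the front as a design choice, and `get` raises on an empty queue where a Redis-backed version would block with a timeout.

```python
from collections import deque


class JobQueue:
    """Hypothetical job channel: available, in-flight, and stalled jobs."""

    def __init__(self):
        self.available = deque()   # job ids waiting for a worker
        self.in_flight = set()     # claimed but not yet finished
        self.stalled = set()       # parked because they keep causing trouble
        self.payloads = {}         # job id -> payload
        self._next_id = 0

    def publish(self, payload, front=False):
        self._next_id += 1
        job_id = self._next_id
        self.payloads[job_id] = payload
        if front:
            self.available.appendleft(job_id)
        else:
            self.available.append(job_id)
        return job_id

    def get(self):
        """Claim the next job (a real implementation would block with a timeout)."""
        job_id = self.available.popleft()
        self.in_flight.add(job_id)
        return job_id, self.payloads[job_id]

    def finish(self, job_id):
        self.in_flight.remove(job_id)
        del self.payloads[job_id]

    def cancel(self, job_id):
        """Give up on an in-flight job and make it available again."""
        self.in_flight.remove(job_id)
        self.available.appendleft(job_id)

    def stall(self, job_id):
        """Move a problem job out of the way."""
        self.in_flight.remove(job_id)
        self.stalled.add(job_id)

    def retry(self, job_id):
        """Make a stalled job available again."""
        self.stalled.remove(job_id)
        self.available.append(job_id)


jobs = JobQueue()
resize = jobs.publish("resize avatar")
email = jobs.publish("send welcome email")
claimed_id, claimed_payload = jobs.get()   # a worker claims the first job
jobs.stall(claimed_id)                     # it keeps failing, so park it
jobs.retry(claimed_id)                     # later, put it back in line
```

A job is always in exactly one of the three places, which is what lets a crashed worker's job be noticed and handed to someone else.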

Sets will be added in the near future as a means for maintaining live filters/queries for feeds and other data.

An example Thoonk ecosystem:


Thoonk is a tool which allows you to create an Internet service as a wide ecosystem rather than a deep application. Say we provide a series of 8 node.js processes to take advantage of the number of CPU threads available. This node.js application provides a websocket interface to a browser-js application with live events coming from Thoonk feeds on Redis, organized by individual users and teams. In another process, we might run a Ruby service that provides a REST interface for manipulating and querying objects within users and groups. Say also that we want to peer certain data with other services: we can run a Python process which provides XMPP Publish-Subscribe (XEP-0060) and a Java interface which provides a PubsubHubbub interface. In addition to that, background jobs that absolutely have to be done can be pushed through a job system with workers running in C.

All of these separate components subscribe to the feeds pertinent to their function as well as provide relevant ACL and interface to the end-points. You are now free to use the most appropriate tools for the job, distribute load, organize application data, and selectively synchronize state easily. Of course, if you don’t have to have a lot of processes on a lot of servers in a lot of languages, you can still take advantage of compartmentalizing and duplicating your components.

Backstory


I find messaging to be an interesting problem, particularly when machines communicate to share state, make requests, etc. However, messaging has limited use without persistent data, which is why I like XMPP Publish-Subscribe (XEP-0060) so much. Feeds of data, which combine data persistence with publish-subscribe events about changes to the data, are incredibly valuable in machine-to-machine communication.

This is something that I’ve been applying to clustering, configuration distribution, job distribution and management, real-time webapps, and other problems for years now in my consulting work.

Then, I discovered Redis, which is a very fast key-store-with-containers database that also includes publish-subscribe, and I immediately knew what I had to build.

I’m publishing this as MIT because I not only want to share it, but I want your feedback, harsh criticism, and contributions. We need more implementations in other languages, and I’d love to see people publish tools that contribute to Thoonk interfaces. In addition, please point out flaws in the contract.txt (schema) document, show us your extensions and own object types, etc.

Just hit up myself @fritzy and/or Lance Stout @lancestout on twitter, follow the github projects (Thoonk.js and Thoonk.py), and watch http://thoonk.com.

Our team at &yet always seems to find our way to work on interesting things, so be sure to follow us on Twitter for the latest.

-Nathan Fritz, &yet Chief Architect

Using CSS3 to create an image-free Progress Bar

Advanced pseudo-classes and pseudo-elements are not pseudo-amazing, they are actually amazing

I’ve been playing a lot with advanced pseudo-classes and pseudo-elements for a project Aaron and I have been working on.

Originally, I was just going to share it as a blog post, but instead, Aaron and I hacked it together as a generator you can use to make a purely CSS3 progress bar, like this:

Here’s how we did it.

Fitts’s Law in meatspace

How we used our workspace’s “edges” to hack ourselves into consistently logging hours

At &yet, we’re always fighting to get ourselves to log hours. We recently came up with a method inspired by Fitts’s Law that’s proven quite effective.

Fitts’s Law, as it applies to interface design, essentially says the smaller and further away a target is, the harder it is to hit.

That’s why we get Apple positioning the OS X menu at the very top of our workspace and the Dock at the bottom. The edge of the screen can be said to have infinite width in the direction the mouse hits it, making it an easy target.

What we’ve realized is that Fitts’s Law can apply in physical space, too. Take time tracking, for example.

We all want to track our time, but nothing we’ve tried has worked. Lots of methods and tools help *some* people, but there are few things that help *everyone*. Nagging and reminders are worthlessly ineffective when it comes to changing behavior. Tools are hit or miss.

We’ve tried lots of things, but this one was 100% successful: over several weeks’ time, the longest any person went without logging their hours was a day and a half—and most logged them daily. That has never happened in three years of trying to solve this.

Here’s how we “Fittsed” time-tracking.

Logging time effectively is a very hard thing. Doing it consistently means a whole lot of tiny actions and a ton of reminders, each individual piece fairly meaningless. To further complicate matters, it’s something that highly productive people naturally want to “background”, not stick in their face while getting stuff done. That means by design, it’s going to be easy to forget.

What if we could associate logging time with the “edge” of our workspace in real life, not the workspace itself?

Our kitchen is central to our office in almost every way. We keep it stocked daily with whatever snacks and drinks each member of our team wants to have available. The coffee’s there and lots of great conversations get started in the kitchen, too. Everyone uses it every day, all day—typically at stopping points in their work.

I know from experience it won’t work to just stick a reminder on my monitor or where I’ll see it as I walk away from my desk. Unlike those, the kitchen is a destination. It’s a hard “edge” in the office interface.

So now we use it as a privilege, not a right. There is a piece of paper taped to the door with a line down the middle and everyone’s names on tiny Post-It Notes. On one side is the word “BANNED” and on the other, “DEBANNED”.

If I haven’t logged my hours within the past 24 hours, my name gets moved from the “DEBANNED” to the “BANNED” list (by Lisa, our office manager). All I have to do to get it moved back is go to my office and spend a moment logging hours.

The centrality of the kitchen means it’s easily enforced by peer pressure and the additional office fun of public “shaming”.

It’s helped along by another edge: the hallway and the kitchen are also the edge of my personal bubble. When we’re out in the hall, we rib each other and tease each other and push each other to succeed on whatever we’re working on together. That edge reminds me not just of the edge of my workspace, but the edge of where “I” end and “we” begin. Not a bad place to be reminded of the intrinsic value of logging hours.

It’s surprised us how well this has worked. In fact, un-banned people walking into the kitchen mid-day have gone back to log their recent hours because of the physical space reminder.

For some time, we have been thinking constantly about ways to improve the user experience of accomplishing big things as a team with minimal intrusion and maximum benefit. Most of the learning we’ve done in this area we’ve been pouring into a product we’ve been building called &! (“and bang”). You can get on the invite list here.


Many thanks to Nate for his critical feedback on this post, without which it might have been even more abstract and made less sense. :) But, hey, you know what they say…

Choosing a template language(s)

So many {{noun}}s, so little time

Template languages: a densely populated land, where not even angels fear to tread.

Let’s rush right in.

Mail merge for web developers.

A template language is mail merge for web developers.

A template language is not a transformational grammar capable of specifying a projection between semantic signals and a diversely formatted symbolic representation.

A template language is for taking crap and pooping it into HTML.

(Pardon my French)

Some template languages are programming languages.

Every template language starts with the basic notion of inserting variable data into a reusable structure.

// C stdio - http://pubs.opengroup.org/onlinepubs/009695399/functions/printf.html
printf("There are %lu lights", num_lights);

# Python - http://docs.python.org/library/stdtypes.html#string-formatting
data = {'verb':"run", 'subject':"refrigerator",}
print("See %(subject)s %(verb)s. %(verb)s, %(subject)s, %(verb)s!" % data)

// flatstache.js - https://github.com/natevw/flatstache.js
var data = {thing:"World"};
Flatstache.to_html("Hello, {{ thing }}!", data);

But simple variable replacement isn’t enough for most template languages.

The three examples above are just “string formatting”, not “template expansion”: When num_lights is 1 we should say “light” instead. Maybe we’d like to capitalize a string when we’re using it at the beginning of a sentence. Maybe we want to style every other table row with alternating colors, or add a special border to the first and last list items…

<table border='1'>
<tr><th>Name</th><th>Age</th></tr>
<?
$result = mysql_query("SELECT * FROM people");
while ($row = mysql_fetch_array($result)) {
echo "<tr><td>" . $row['Name'] . "</td><td>" . $row['Age'] . "</td></tr>";
}
?>
</table>

Some template languages become PHP without the plumber’s crack. When they’re well-designed, these empower non-“coders” to dabble in programming logic without giving them too much rope to SQL inject themselves with.

A simpler approach?

Other template languages are just template languages.

These are the ones that talk a lot about separating logic from presentation in their documentation. What they mean is, they expect you’ll be providing data to your templates via a perfectly good programming language. What they mean is, “you shouldn’t expect your template language to do a programming language’s work”.

Usually these other template languages implement only two logic features: loops and conditionals. The only “programming” the template author can do is declare that certain sections of their template get repeated or left out if certain data has been repeated or left out.
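To show how little machinery "loops and conditionals" really needs, here is a minimal sketch of a logic-less renderer. The `{{#name}}...{{/name}}` section syntax is borrowed from mustache, but the `render` function is a toy written for this post, not the mustache implementation: it does no HTML escaping, expects list entries to be dictionaries, and does not handle a section nested inside another section of the same name.

```python
import re

# One pattern, tried left to right: a {{#name}}...{{/name}} section,
# or a plain {{ name }} variable.
TOKEN = re.compile(
    r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}|\{\{\s*(\w+)\s*\}\}", re.DOTALL
)


def render(template, data):
    """Expand variables and sections; a toy, with no escaping."""

    def expand(match):
        section, body, variable = match.groups()
        if variable is not None:
            return str(data.get(variable, ""))
        value = data.get(section)
        if not value:
            return ""   # missing, False, or empty: the section is left out
        if isinstance(value, list):
            # A list repeats the section once per entry.
            return "".join(render(body, {**data, **item}) for item in value)
        return render(body, data)   # any other truthy value: include once

    return TOKEN.sub(expand, template)


template = (
    "<ul>{{#people}}<li>{{ name }}{{#admin}} (admin){{/admin}}</li>{{/people}}</ul>"
    "{{#nobody}}<p>No one is here.</p>{{/nobody}}"
)
data = {"people": [{"name": "Ada", "admin": True}, {"name": "Bob"}]}
page = render(template, data)
```

The template author can say "repeat this" and "skip this", and that is all. Anything fancier has to arrive already computed in `data`.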

What this means

A template combines variable data with a predefined structure. Taking the programming out of the structure means moving that programming to where the data is gathered.
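As a small illustration of moving the programming to where the data is gathered, here is the lights example again with the singular/plural decision made in Python and a template that only fills slots. The names `PAGE`, `lights_context`, and `render_lights` are invented for this sketch; `string.Template` is from the Python standard library.

```python
from string import Template

# The template has no logic at all: three slots, nothing else.
PAGE = Template("There $verb $count $noun.")


def lights_context(num_lights):
    """Make the wording decisions in code, before the template sees the data."""
    singular = num_lights == 1
    return {
        "verb": "is" if singular else "are",
        "count": num_lights,
        "noun": "light" if singular else "lights",
    }


def render_lights(num_lights):
    return PAGE.substitute(lights_context(num_lights))


one = render_lights(1)
four = render_lights(4)
```

The cost is visible right away: whoever edits the template can no longer change the pluralization rule without touching `lights_context`.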

Unfortunately, in situations where development roles are split between separate teams, this means that the “Frontend” team needs to rely more on the “Backend” team’s help.

What else this means

Fortunately, this also means that the “Frontend” team needs to rely more on the “Backend” team’s help. This might have some slight productivity drawbacks, but can lead to overall architectural and performance benefits.

Maybe it’s only because most programmers aren’t people people, but careful “separation of concerns” typically pays dividends wherever it’s applied to software. Fetching and manipulating data at a lower level forces everyone to think specifically about what information a given web page really needs, and how that might affect the simplicity or scalability of the system as a whole. (At last year’s DjangoCon, we heard some interesting stories about “little” template changes causing an order-of-magnitude more database load.)

Now there’s nothing inherently evil about picking a more empowering template language — unless you’re “binding” variables into your SQL queries using string concatenation, in which case Dante would like to interview you for a book he’s working on.

Better to give a good “Frontend” developer any programming features they need in their template language, than trust a bad “Backend” developer with all the power tools and blasting powder of a bona fide programming language.

What to do about it

It should be clear why there are so very very many different “competing” template engines: every one makes different choices about how a template is defined, and also about what a template can define!

So what should we look for in a template language?

The “perfect blend” will vary from team to team, and even from project to project. But overall, a template language should:


  1. Be easy to implement in and access from the languages that matter these days — For &yet right now that means JavaScript, Python and (in a pinch) Objective-C. Your thing might be Ruby and C#, that’s cool too. The obvious point is that if a templating engine would be hard for you to use, it’s the wrong one for you to choose. On the flip side: if your Django/Rails/Express/jQuery/framework-thing makes it easy to use one particular built-in template language, that’s the one for you.


  2. Encourage thoughtful use of source data — A template language shouldn’t tie the hands of its primary users, but it does need to prevent as much fat-fingering as possible. Mistakes like Cross-Site Scripting and SQL injection are unacceptable for most projects. On a few projects, unoptimized database usage could be costly. On all projects, keeping everyone aware of what data is available (and how) is beneficial.


  3. Keep the app as a whole clean and simple — Boring boilerplate code encourages abundant annoying bugs. Any stuff that doesn’t make your function, your page or your project special all needs to be typed correctly anyway. So take the “template language” idea of variable data vs. reusable structure, and apply that to your team’s work as a whole. Your template language, like any language, should help you simply say what you mean.
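For the second point, the two classic fat-finger mistakes each have a boring standard-library fix in Python. This is only a sketch: `render_comment` and `find_people` are hypothetical helpers, and the `people` table simply mirrors the PHP example earlier in the post.

```python
import sqlite3
from html import escape


def render_comment(author, text):
    """Escape every value on its way into the markup, not at each call site."""
    return "<p><b>{}</b>: {}</p>".format(escape(author), escape(text))


def find_people(connection, name):
    """Look up people by name without building SQL from strings."""
    # The "?" placeholder lets the driver bind the value; it is never
    # spliced into the SQL text.
    cursor = connection.execute(
        "SELECT name, age FROM people WHERE name = ?", (name,)
    )
    return cursor.fetchall()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (name TEXT, age INTEGER)")
connection.execute("INSERT INTO people VALUES (?, ?)", ("Ada", 36))

safe_html = render_comment("Mallory", "<script>alert(1)</script>")
found = find_people(connection, "Ada")
injected = find_people(connection, "x' OR '1'='1")
```

A template language that escapes by default is doing the first half of this for you on every variable, which is exactly the kind of fat-finger prevention worth looking for.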


At &yet we’ve used everything from Django’s template system (a full-featured engine that mimics Python to good effect) to mustache.js (an intentionally simple language with compatible libraries in many programming languages), and other interesting ones like Jade and Haml (which compile indented structures into HTML). In the future we may find something like Knockout useful. In past lives we’ve been known to use XSLT and various PHP and ColdFusion solutions to get the job done.

Remember that in the end, your choice of template language is tiny and insignificant compared to, say, your choice of the people you work for and the people you work with. Each templating engine will have its upsides and downsides; being conscious of its overall approach, goals and shortcomings will help you and your team use it most effectively.


“Capable” isn’t a strategic planning metric

You have a dream.

So, just like every single one of us, you ask, “Do I have what it takes?”

The answer is yes. Every other answer is a lie, an excuse or a distraction. The call itself is enough of an answer.

I consider myself good at a few things, passable at many, and passionate about more. What I’m capable of is completely irrelevant. I’m likely the worst to judge that anyway.

You prove yourself “capable” by simply *doing*.

So do.

/dev/castle: The Movie

Our filmmaker friend Melani Brown made a cool short film about our month-long adventure working from a castle in Italy.



In case you missed it, here’s the original blog post.

Building a perpetual learning machine

…and announcing Tumbleweed Tech!

In the midst of a particularly enjoyable college semester ten years ago, my good friend Eric Cadwell and I joked that a great job would be just going to school full-time for life.

I decided to figure out how to make a career out of it, in one way or another.

On the list of enjoyable things about the years that followed working as a pastor was the constant learning; I enjoy wrestling deeply with theology and its practicality, plus there’s no shortage of learning opportunities dealing with the human dynamics that come with ministry—painful, yes, but certainly plenty.

When I started &yet, I had the idea of building a business around the things that I had spent the bulk of my free-time learning (namely, web development and design). I figured if doing that could make me at least $30k a year, that was good enough. I mean, heck, there’s no school that’ll pay you a net gain of $30k to learn whatever you want!

It’s worked out better than I thought — I’ve improved my design skill and learned a ton about what it takes to make great software. Plus, you can’t beat the opportunity to learn from people you’ve helped teach.

Amy hadn’t designed for web when we added her to our team and now she’s at the top of her class. Her design aesthetic has always been second-to-none, however, and I’ve learned a huge amount from her intentional simplicity. Working with her has made me a much better designer and made me realize how far I have to go.

We spent the first few months of James’s time with us helping get him up to speed with our process and writing high quality HTML and CSS — but more recently instead of teaching him, he’s doing the educating on advanced CSS3 techniques.

But we’re just getting started.

I feel like I’ve only started with what our folks are capable of teaching. We’ve been a real company rather than a fun ad hoc group for some time now, and our intent is to take advantage of some of that stability to formalize our commitment to education.

Thinking about all of this made us realize that if (1) you can build a business of diversely talented people who enjoy teaching and learning and (2) you intentionally make learning a formal part of your work, then essentially what you have is a miniature, ad-hoc university (in its most ideal form).

And if that’s what you have, why stop internally? Why not share it?

So, today, we’re announcing Tumbleweed Tech, our effort to provide a solid alternative to the less-than-modern approach to web development taken at local colleges and universities.

Don’t get me wrong. I love academics and degrees have their place, but sometimes, you just want to know what you need to know to dive in and get things done. Add to that access to some talented, experienced people, and we think it’s a great approach—one that many students will gain from.

Tumbleweed Tech will begin this Fall.

We have a list of potential classes outlined and we’ll be basing our initial offering on what people want to learn. So drop your name in if you’re interested! We’d love to have you.

Welcome, Shenoa

&yet Community Coordinator

We are excited to add Shenoa Lawrence to the &yet team. She began serving part-time as &yet's Community Coordinator last week.

Shenoa has taken a strong leadership role in our local tech community: <!doctype society>, Room to Think (our local coworking movement), and TriConf (a local barcamp &yet helped sponsor last weekend). She’s also in the process of putting together weCreate, a local directory of people, projects, and products that make up our community. Her dedication and contributions have been a major part of the continued success of all of the above.

We want to affirm that dedication and empower her to continue it.

Shenoa is a veteran web developer and designer, and served as a leader of a community she was a part of in the San Francisco Bay Area. Members of our community have huge respect for Shenoa as an individual and as a contributor to the big success of our local dev community.

Since its first days, &yet has invested time and money in helping build our area's designer and developer community. Our team considers it one of the most important things we've been privileged to contribute to.

This is a continuation of those efforts.

We take a realistic view that community is something that emerges from intentionally cultivated soil—there are both mechanic and organic aspects to a good community, and both require hard work.

Since it began in February, <!doctype society> has gained over 70 members and drawn participants from Walla Walla, Yakima, and Spokane, but we know there are many more who should be a part of our local web development and creative community. And our aspirations for these groups are bigger than mere social gatherings: we want to spark the founding of numerous startups in our area and help provide resources for them to succeed.

We're excited about what Shenoa has contributed so far to that end, thrilled to be able to team up with her further, and eagerly anticipate what's next.

Please thank Shenoa for her hard work and dedication and for being willing to take on this new challenge as a continuation of what she's already helped to build.

Welcome, Melani

Monday will be Melani Brown’s first day as a full-time &yet team member—we can’t wait!

Melani is a talented filmmaker and photographer who will be doing awesome stuff of that sort with us.

She has worked on Kill Bill, Desperate Housewives, Nike commercials, and the online Old Spice social media ad campaign. She has photographed Bon Iver, Sallie Ford & the Sound Outside, and numerous indie bands.

Mel is a longtime friend of the equally talented Amy Lynn Taylor, and we were privileged to have her provide our team’s photography a couple years ago. We’ve enjoyed several one-off collaborations with her since, including inviting her to participate in our team’s month-long stay in an Italian castle this Spring.

It’s been clear for some time that she’s an unofficial member of our team, more than anything because she comfortably fits our approach and values: she’s talented, creative, passionate, and has an attitude of encouraging those around her to grow and succeed.

In her many years of travels across the globe, she could best be described as an itinerant blesser. We feel blessed to officially make her a part of our team.

In addition to the great short film she made about our Italy adventure, here are a couple more examples of Mel’s great work:

coding & designing from around the world from Melani Brown on Vimeo.

Pocket Portrait: 02 Ritchie Young from Melani Brown on Vimeo.

A cross-communal conference all about realtime technology – for developers by developers

Realtime is becoming a central part of Internet technology.


It’s sneaking its way into our lives already with push notifications, Facebook and Google’s web chats, and it’s a core focus for startups like Convore, Pusher, Superfeedr, Browserling, NowJS, Urban Airship, Learnboost, our own &! (andbang), and many more.


What’s most interesting to me is how accessible this is all becoming for developers. In my presentation at NodeConf I mentioned that the adoption of new technology seems directly related to how easy it is to tinker with it. So, as realtime apps get easier and easier to build, I’m convinced that we’re going to see a whole slew of new applications that tap this power in new, amazing ways.


We at &yet have built five or so realtime apps in the past year, and we’re super excited about this stuff. We’ve also discovered that there are a slew of different methods and tools for building these kinds of apps—we’ve used a number of them. Different developer communities have been solving the same problems with different tools and it’s been amazing to see how much mindblowingly awesome code has been so freely shared. However, there’s still a bit of a disconnect, because it often happens within a given dev community. We always find that we learn the most when we talk to and learn from people who are doing things differently than we are.


So what can we do to encourage more of this?


That’s exactly the conversation Adam and I were having when we went to the XMPP Summit in Brussels, Belgium. That conversation culminated in a crazy idea: We should put on a conference entirely focused on realtime web stuff!


It’s crazy, for a couple of reasons. First, we’ve never organized a conference before and secondly we’re in Eastern Washington, not exactly a tech hotspot (although, we’re working on that too). Luckily we’re fortunate to have made some awesome friends as we’ve attended conferences, written blogposts and worked on pretty cool projects for our clients.


We’re teaming up with Julien Genestoux and Superfeedr to make this all happen. Julien is a pioneer and incredible visionary when it comes to realtime technology. Superfeedr was one of the early startups in the realtime web world. Whether you know it or not, you’ve probably benefitted from superfeedr’s technology while using other services like gowalla, tumblr, etsy, posterous and many more.


Together we’ve managed to line up a ridiculously awesome list of speakers that we’re gradually announcing. So far, we’ve announced Guillermo Rauch (creator of socket.io), Leah Culver (founder of Convore and previously pownce.com), and James Halliday (JS hacker and creator of dnode, browserify, and a bunch of other awesome stuff under the alias “substack”). Also, we’ve just added realtime veteran Jack Moffitt (@metajack) and NowJS’s Sridatta Thatipamala (@sridatta).


Personally, I’m way more excited about attending this event than I am being part of organizing it. These people are my heroes. We’ve got several more really interesting folks on the TBA list as well.


We’ve been getting some great advice from Chris Williams (JSConf‘s daddy) on how to put on a kick-ass conference. We don’t know if we’ll make any money, in fact, our main goal is just to not lose money. We just want to bring together all of these amazing people from various communities that are pushing the envelope of what can be done in a browser. We need to listen to each other, learn from each other and push each other to solve the problems that can make more awesome apps a possibility.


In order for attendees to get the most value possible, we’re going to do a presentation track (on the top floor) and then a hack-track (on the lower floor), where the presenters can do smaller, follow-up sessions, how-to’s, training, etc. Multiple hack-tracks will be going on simultaneously. The goal being for people to be able to get more in-depth knowledge on the topics that interest them most.


We’re also trying hard to get representatives of various dev communities, so that no one stack is touted as the “One True Way”. That’s just silly. We all have our favorites, I get that, but ultimately we’re better off if we learn from each other, especially from those who are not using our tools of choice. There’s a whole batch of new problems to solve in building (and scaling) rich, real-time applications that work on as many devices as possible.


The details


KRTConf will be Nov. 7-8 in Portland, OR. All the details are on krtconf.com, and new stuff is being announced as it happens on twitter at @krtconf and on this blog.


If you wanna be there, you can get tickets on eventbrite and if you’re interested in speaking, sponsoring or otherwise being involved in the event, email Adam adam@krtconf.com or myself henrik@krtconf.com or hit us up on twitter @adambrault @henrikjoreteg.


I’m super excited to be a part of this and hopefully I’ll see you there!

Unsafe at Certain Speeds: Dangers Designed into Django

A good software development framework should make the common things easy and make the uncommon things possible.

Unfortunately, Django sometimes makes the simple things easy and the hard things possible — and security is hard!

What Django does well

The Django community does take security very seriously.

The ORM makes it really difficult to expose your app to SQL injection attacks. The template processing system makes it hard to enable cross-site scripting. It takes work to avoid Django’s CSRF protection, and it’d be rare to subvert its well-tested session handling.

Not only that, but Django’s documentation and release notes go the extra mile, discouraging many poor practices and even warning against problems outside of Django that could affect the security of a web app.

Django’s target market

So what’s the problem?

Django has its roots in the publishing industry and got its wings as a basis for sharing-oriented “Web 2.0” sites. When a majority of resources are publicly available, or shared among all logged-in users, it’s possible to focus on securing a few private corners.

What Django’s design considers uncommon is “multitenant” apps — imagine that instead of adding a blog to your company website, you are building a corporate blog–hosting service.

With a single-tenant app, there’s generally some level of trust among all users. Maybe an intern is only supposed to edit customer support documents, but discovers a bug in the custom CMS built on Django that lets him post a funny picture of his boss on the homepage. Sure it was technically the Django-using programmer who let it happen, but it was the intern who betrayed the tenant’s trust.

With multiple tenants, the responsibility of trust is upon the developers. When some computer-savvy ACME Corp employees find a hole that lets them access Wonder Inc’s draft blog posts, they’re just doing their job. If Wonder Inc imagined their exciting product announcements were safe inside your Django app, they won’t care how easy it was to make that security mistake.

What Django makes easy

Most days, most developers are struggling valiantly just to get their code to work. Getting it to “compile”, getting it to “run”, getting it to run “on the production database”. Fixing it to stay running even when a user clicks Y before they click X.

Security is hard because you still have to do all the “just getting it to work”, but you also have to make sure it doesn’t work even if a different user clicks X, fakes Y and then does Z with a little help from A.

Let me make this clear: security mistakes are too common to be a problem of “stupid developers”. Leave the PEBKAC mentality for the poor techs who have to support what they can’t fix — we are developers and designers, busy developing our designs. Engineering, Enforcement and Education are wonderful, but usability is cheaper.

Django’s ORM makes it easy (too easy) to expose database rows to users who shouldn’t have access. It provides a very user-friendly mapping from SQL to model objects. The catch is, the database doesn’t give a rat’s rooty-tooty about your app’s permission model, and neither does the ORM. The database’s job is to be the floppy disk for your spreadsheets; the ORM’s job is to pretend the spreadsheet rows are documents. Fair enough. But the tools Django provides for validating data access are too difficult to customize for an app where every table is shared among mutually untrusted tenants. Remember that developers are naturally inclined to code until it works for them, not to prove that the same code won’t work when an attacker calls it up.

The template processing and file handling infrastructure encourage developers to expose private user uploads via statically hosted media directories. This is fine for a blog, but when a user notices their private upload got renamed to “/media/user_images/image_______.jpg” they might start figuring out that Apache will gladly let them see “image___.jpg” (and “image.php”!) in that directory too.
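One way around the guessable-filename problem, sketched in plain Python rather than Django: keep uploads out of any statically served directory, hand out random tokens instead of names derived from the original file, and check ownership on every read. The `UploadStore` class is hypothetical and keeps everything in a dictionary; a real app would put the bytes on disk or in object storage and serve them through a view that performs the same check.

```python
import secrets


class UploadStore:
    """Hypothetical private upload store: random tokens plus an ownership check."""

    def __init__(self):
        self._files = {}   # token -> (owner_id, original_name, content)

    def save(self, owner_id, original_name, content):
        # The token is random and not derived from the filename, so seeing
        # one token tells a user nothing about anyone else's uploads.
        token = secrets.token_urlsafe(32)
        self._files[token] = (owner_id, original_name, content)
        return token

    def read(self, requesting_user_id, token):
        record = self._files.get(token)
        # Same error for "missing" and "not yours", so tokens cannot be probed.
        if record is None or record[0] != requesting_user_id:
            raise PermissionError("no such upload for this user")
        return record[2]


store = UploadStore()
token = store.save("alice", "image.jpg", b"private bytes")
own_copy = store.read("alice", token)
```

The random token alone is not the protection here. The ownership check is, and the token just makes the check harder to skip by accident.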

Finally, while most of Django’s middleware does enhance web app security, the error debugging system can lead to inadvertent storage of sensitive user data if an exception catches it mid-flight. This issue is being addressed for Django 1.4, although the design is opt-in and may be a bit fragile in practice — but this particular problem is both hard and uncommon. In this last case I suspect the solution being built in is a good enough design.

How to make secure apps more common

That leaves us with Django’s ORM and file handling — which I’m convinced are not good enough designs for a multitenant web app framework.

In a multitenant app it is very common that model lookups and form validation must be contained to a stricter subset of data than Django encourages.

The very best solution to this problem is to partition your app. Give each tenant their own virtual system, their own database — in short, their own copy of the hosted app. Partitioning does take more work to configure up-front, but that’s the best place for investments like that. It also complicates cross-account administration features: which is exactly the point. Make the uncommon use cases the harder ones, so that the normal stuff is securer by default.

If you’re not ready or it’s too late to partition, do your whole team a favor and stop using Django’s ORM and ModelForms directly in a multitenant codebase. You need to write an API and force all your code to use it, instead of the ORM. Django’s views are too presentation-focused. Not the place to expect secure code. When coding up a working user interface, it’s too easy to say “My code needs this object!” when you mean “Some user would like to access this data?”. Give day-to-day development the freedom to wholeheartedly fight For the user. Build an internal Python data access API for the sole purpose of standing between the user request and the ORM or filesystem; a good gate on this border can keep a thousand welcome mats safe.

Whether you partition your app into single-tenant instances or use an API to isolate data access, you should develop tests primarily for security. If a commit breaks functionality, it’s an obvious bug. Someone will complain soon enough. If a code change only adds “functionality” that isn’t supposed to exist, it’s a zero-day. Will you notice the mistake in time?

Interestingly enough, our security tests do tend to catch functionality regressions too, since they really must check that Mallory can’t do something Alice and Bob can. That’s a nice benefit, especially since you’re still updating tests because the app is getting better and its security needs to as well. (Having to maintain tests that only lock in functionality as it continually changes, sucks.)

Focus your programmatic testing efforts on permissions enforcement. Your time is precious — don’t bother with automated tests for anything less valuable than earning trust!

Make boring mistakes hard

Django is a great traditional web framework that makes many customizations easy. It’s possible to build secure multitenant apps using the pieces Django provides, although certain built-in features and certain patterns encouraged by the documentation need to be avoided.

I suspect this is also the case with many other web frameworks. And security might not be the only area where developers’ toolkits make doing things “the wrong way” the easy way.

Pay attention to design decisions at the framework level that distract your team from delivering a great user experience at a higher level.

Avoid shooting yourselves in the foot (feets?) by only picking fights on fronts where the troops will stay engaged. Make solving interesting problems the only uphill battle for your developers. Then level the field for your customers. (That’s what usability is about.)

To see us work it’s not self-evident, but nerds invented computers to avoid tedious mistake-prone work. Like end-users, developers have lives and are busy and are experts only in their own passions. Assume security will be taken for granted by users, and developers alike!

If secure web apps should be common, vulnerable code must be made hard to write. It is a good workman’s responsibility to blame his tools every now and then — occasionally we get something as useful as Django as a result!

Parenting with CSS4. And Vengeance. A Very Special Vodcast.

It’s our first podcast, or maybe &cast, and what a start we’re off to.

James displays a knack for not preparing, being distracted, and wiping sweat off his face. He does, however, know what he’s talking about when it comes to CSS specs. Eric asks James to explain the newly proposed subject selectors, link pseudo-classes and whether or not anyone could become Batman, realistically.

Let us know what you think about the CSS4 proposals and how excited you are about the “parent” selector. Because as you can tell, we’re wicked excited about it over here.

Credits:

“Talent”: @ericzanol (left) and @jamesmenera (right).

Video filmed and produced by the awesome Ms. Mel.

An interview with Amber Case, cyborg anthropologist and software futurist

Our &yet team were privileged to get to sit down and chat with one of the young visionaries of modern software, Amber Case, Cyborg Anthropologist and Geoloqi founder.

Thanks to Pie and Wieden+Kennedy.

Music by Dat’r
Film by Melanie Brown

We shipped an app that requires WebSockets. Here’s why:

Last week we launched our newest product, &!, at KRTConf. It’s a realtime, single-page app that empowers teams to bug each other less and get more done as a team.

One of our speakers, Scott Hanselman from Microsoft, tried to open the app in IE9 and was immediately redirected to a page that tells users they need WebSockets to use the app. He then wrote a post criticizing this choice, his argument being that users don’t care about the underlying technology, they just want it to work. He thinks we should provide reasonable fallbacks so that it works for as wide an audience as possible.

I completely agree with his basic premise: users don’t care about the technology.

Users care about their experience.

I think this is something the web has ignored for far too long so I’ll say it again:

Users only care about their experience.

In this case, we’re not building a website with content. We’re building an experience.

We didn’t require WebSockets because we’re enamored with the technology; we require it precisely because it provides the best user experience.

The app simply doesn’t feel as responsive when long-polling. There’s enough of a difference in lag and responsiveness that we made the choice to eliminate the other available transports in Socket.io. (We’re doing a lot more with our data transport than simply sending chats.) Additionally, we’re also using advanced HTML5 and CSS3 that simply isn’t available yet in IE9. It turns out that checking for WebSockets is a fairly good litmus test of the support of those other features (namely CSS3 transitions and animations). The app is just plain more fun to use because of those features.

Apple beat Microsoft by focusing on user experience. They unapologetically enforced minimum system requirements and made backward incompatible changes. Why is it considered “acceptable” to require minimum hardware (which costs money), but it’s somehow not acceptable to require users to download a free browser?

I’ve said this over and over again: web developers who are building single-page applications are in direct competition with native applications.

If we as web developers continue to limp along supporting less-than-top-notch browsers, the web will continue to lose ground to the platforms that build for user experience first. Why should we, as a small bootstrapped company, invest our limited resources building less-than-ideal fallbacks?

All this, of course, depends on your audience. We created &! for small, forward-thinking teams, not necessarily their moms. :)

Backbone.js and Capsule and Thoonk, oh my! A scalable realtime architecture

This last year, we’ve learned a lot about building scalable realtime web apps, most of which has come from shipping &bang.

&bang is the app we use to keep our team in sync. It helps us stay on the same page, bug each other less and just get stuff done as a team.

The process of actually trying to get something out the door on a bootstrapped budget helped us focus on the most important problems that needed to be solved to build a dynamic, interactive, real-time app in a scalable way.

A bit of history

I’ve written a couple of posts on backbone.js since discovering it. The first one introduces Backbone.js as a lightweight client-side framework for building clean, stateful client apps. In the second post I introduced Capsule.js, a tool that I built on top of Backbone that adds nested models and collections and also allows you to keep a mirror of your client-side state on a node.js server to seamlessly synchronize state between different clients.

That approach was great for quickly prototyping an app. But as I pointed out in that post, that’s a lot of in-memory state being stored on the server, and it simply doesn’t scale very well.

At the end of that post I hinted at what we were aiming to do to ultimately solve that problem. So this post is meant to be a bit of an update on those thoughts.

Our new approach

Redis is totally freakin’ amazing. Period. I can’t say enough good things about it. Salvatore Sanfilippo is a god among men, in my book.

Redis can scale.

Redis can do PubSub.

PubSub just means events. Just like you can listen for click events in Javascript in a browser you can listen for events in Redis.

Redis, however is a generic tool. It’s purposely fairly low-level so as to be broadly applicable.

What makes Redis so interesting, from my perspective, is that you can treat it as a shared memory between processes, languages and platforms. What that means, in a practical sense, is that as long as each app that uses it interacts with it according to a pre-defined set of rules, you can write a whole ecosystem of functionality for an app in whatever language makes the most sense for that particular task.

Enter Thoonk

My co-worker, Nathan Fritz, is the closest thing you can get to being a veteran of realtime technologies.

He’s a member of the XSF council for the XMPP standard and probably wrote his first chat bot before you knew what chat was. His SleekXMPP Python library is iconic in the XMPP community. He has a self-declared unnatural love for XEP-0060, which describes the XMPP PubSub standard.

He took everything he learned from his work on that standard and built Thoonk. (In fact, he actually kept the PubSub spec open as he built the Javascript and Python implementations of Thoonk.)

What is Thoonk??

Thoonk is an abstraction on Redis that provides higher-level datatypes for a more approachable interface. Essentially, staring at Redis as a newbie is a bit intimidating. Not that it’s hard to interface with, it’s just kind of tricky to figure out how to logically structure and retrieve your data. Thoonk simplifies that into a few data-types that describe common use cases. Primarily “feeds”, “sorted feeds”, “queues” and “jobs”.

You can think of a feed as an ad-hoc database table. They’re “cheap” to create and you simply declare them to make them or use them. For example, in &bang, we have all our users in a feed called “users” for looking up user info. But also, each user has a variety of individual feeds. For example, they have a “task” feed and a “shipped” feed. This is where it veers from what people are used to in a relational database model, because each user’s tasks are not a part of a global “tasks” feed. Instead, each user has a distinct feed of tasks because that’s the entity we want to be able to subscribe to.

So rather than simply breaking down a model into types of data, we end up breaking things into groups of items (a.k.a. “feeds”) that we want to be able to track changes to. So, as an example, we may have something like this:

// our main user feed
var userFeed = thoonk.feed('users');

// an individual task feed for a user
var userTaskFeed = thoonk.sortedFeed('team.andyet.members.{{memberID}}.tasks');
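
To make the "one feed per thing you want to subscribe to" idea concrete, here is a minimal in-memory sketch. It is not the Thoonk API: MemoryFeed, taskFeedName and the event payload shape are hypothetical names invented for illustration, and the storage is a plain Map rather than Redis.

```javascript
// In-memory stand-in for a feed. NOT Thoonk itself, only the idea:
// a named collection of items that announces its own changes.
const { EventEmitter } = require('events');

class MemoryFeed extends EventEmitter {
  constructor(name) {
    super();
    this.name = name;
    this.items = new Map(); // id -> item
  }

  // Store an item and tell every subscriber about it.
  publish(id, item) {
    this.items.set(id, item);
    this.emit('publish', { feed: this.name, id: id, item: item });
  }

  // Remove an item; only announce it if something was actually removed.
  retract(id) {
    if (this.items.delete(id)) {
      this.emit('retract', { feed: this.name, id: id });
    }
  }
}

// Hypothetical helper: one feed name per member, so a subscriber
// only hears about that member's tasks.
function taskFeedName(memberId) {
  return 'team.andyet.members.' + memberId + '.tasks';
}

const feed = new MemoryFeed(taskFeedName('42'));
const seen = [];
feed.on('publish', (event) => seen.push(event));
feed.on('retract', (event) => seen.push(event));

feed.publish('task-1', { description: 'write blog post' });
feed.retract('task-1');
```

Because each member gets a separate feed, a subscriber attaches to exactly the changes it cares about instead of filtering a global "tasks" stream.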

Marrying Thoonk and Capsule

Capsule was actually written with Thoonk in mind. In fact that’s why they were named the way they were: You know these lovely pneumatic tube systems they use to send cash to bank tellers and at Costco? (PPSHHHHHHHTHOONK! And here’s your capsule.)

Anyway, the integration didn’t end up being quite as tight as we had originally thought but it still works quite well. Loose coupling is better anyway right?

The core problem I was trying to solve with Capsule was unifying the models that are used to represent the state of the app in the browser and the models you use to describe your data on the server—ideally, not just unifying the data structure, but also letting me share behavior of those objects.

Let me explain.

As I mentioned, we recently shipped &bang. It lets a group of people share their task lists and what they’re actively working on with each other.

It spares you from a lot of “what are you working on?” conversations and increases accountability by making your work quite public to the team.

It’s a realtime, keyboard-driven, web app that is designed to feel like a desktop app. &bang is a node.js application built entirely with the methods described here.

So, in &bang, a team model has attributes as well as a couple of nested backbone collections such as members and chat messages. Each member has attributes and other nested collections, tasks, shipped items, etc.

Initial state push

When a user first logs in we have to send the entire model state for the team(s) they’re on so we can build out the interface (see my previous post for more on that). So, the first thing we do when a user logs in is subscribe them to the relevant Thoonk feeds and perform the initial state transfer to the client.

To do this, we init an empty team model on the client (a backbone/capsule model shared between client/server). Then we recurse through our Thoonk feed structures on the server to export the data from the relevant feeds into a data structure that Capsule can use to import that data. The team model is inflated with the data from the server and we draw the interface.

From there, the application is kept in sync using events from Thoonk that get sent over websockets and applied to the client interface. Events like “publish”, “change”, “retract” and “position”.
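
As a rough illustration of applying those events on the client, the sketch below folds each event into a plain array of items. The event shape ({type, id, item, toIndex}) and the function name applyFeedEvent are assumptions made for this example; the real wire format and Capsule's import logic are not shown in this post.

```javascript
// Apply one feed event to a list of items and return the new list.
// Hypothetical event shape: { type, id, item, toIndex }.
function applyFeedEvent(items, event) {
  const index = items.findIndex((item) => item.id === event.id);
  const next = items.slice(); // never mutate the caller's array

  if (event.type === 'publish' && index === -1) {
    // brand new item: append it
    next.push(event.item);
  } else if (event.type === 'publish' || event.type === 'change') {
    // existing item: replace it in place (unknown ids are ignored)
    if (index !== -1) next[index] = event.item;
  } else if (event.type === 'retract') {
    if (index !== -1) next.splice(index, 1);
  } else if (event.type === 'position') {
    // move the item to a new slot in the list
    if (index !== -1) {
      const moved = next.splice(index, 1)[0];
      next.splice(event.toIndex, 0, moved);
    }
  }
  return next;
}

let tasks = [];
tasks = applyFeedEvent(tasks, { type: 'publish', id: 'a', item: { id: 'a', title: 'Write post' } });
tasks = applyFeedEvent(tasks, { type: 'publish', id: 'b', item: { id: 'b', title: 'Ship it' } });
tasks = applyFeedEvent(tasks, { type: 'position', id: 'b', toIndex: 0 });
tasks = applyFeedEvent(tasks, { type: 'retract', id: 'a' });
```

Since the function only reacts to events, it does not matter which process or server made the original change.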

Once we got the app to the point where this was all working, it was kind of a magical moment, because at this point, any edits that happen in Thoonk will simply get pushed out through the event propagation all the way to the client. Essentially, the interface that a user sees is largely a slave to the server. Except, of course, the portions of state that we let the user manipulate locally.

At this point, user interactions with the app that change data are all handled through RPC calls. Let’s jump back to the server and you’ll see what I mean.

I thought you were still using Capsule on the server?

We do, but differently, here’s how that is handled.

In short… it’s a job system.

Sounds intimidating right? As someone who started in business school, then gradually got into front-end dev, then back-end dev, then a pile of JS, job systems sounded scary. In my mind they’re for “hardcore” programmers like Fritzy or Nate or Lance from our team. Job systems don’t have to be that scary.

At a very high level you can think of a “job” as a function call. The key difference being, you don’t necessarily expect an immediate result. To continue with examples from &bang: a job may be to “ship a task”. So, what do we need to know to complete that action? We need the following:

  • member Id of the user shipping the task
  • the task id being completed (we call this “shipping”, because it’s cooler, and it’s a reminder that finishing is what’s important)

We can derive everything else we need from those key pieces of information.

So, rather than call a function somewhere:

shipTask(memberId, taskId)

We can just describe a job as a simple JSON object:

{
userId: <user requesting the job>,
taskId: <id of task to 'ship'>,
memberId: <id of team member>
}

Then we can add that to our “shipTask” job queue like so:

thoonk.job('shipTask').put(JSON.stringify(jobObject));

The cool part about the event propagation I talked about above is we really don’t care so much when that job gets done. Obviously fast is key, but what I mean is, we don’t have to sit around and wait for a synchronous result because the event propagation we’ve set up will handle all the application state changes.

So, now we can write a worker that listens for jobs from that job queue. In that worker we’ll perform all the necessary related logic. Specifically stuff like:

  • Validating that the job is properly formatted (contains required fields of the right type)
  • Validating that the user is the owner of that task and is therefore allowed to “ship” it.
  • Modifying Thoonk feeds accordingly.

Encapsulating and reusing model logic

You’ll notice that part of that list requires some logic. Specifically, checking to see if the user requesting the action is allowed to perform it. We could certainly write that logic right here, in this worker. But, in the client we’re also going to want to know if a user is allowed to ship a given task, right? Why write that logic twice?

Instead we write that logic as a method of a Capsule model that describes a task. Then, we can use the same method to determine whether to show the UI that lets the user perform the action in the browser as we use on the back end to actually perform the validation. We do that by re-inflating a Capsule model for that task in our worker code and calling the canEdit() method on it and passing it the user id requesting the action. The only difference being, on the server-side we don’t trust the user to tell us who they are. On the server we roll the user id we have for that session into the job when it’s created rather than trust the client.
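
Here is a minimal sketch of that pattern using a plain class instead of a real Capsule/Backbone model. The ownerId attribute and the helper names (shouldShowShipButton, buildShipJob, authorizeShipJob) are hypothetical; only canEdit() and the 'notAllowed' error string come from this post.

```javascript
// Plain-object stand-in for a shared task model (real Capsule models
// are Backbone-based; this only illustrates sharing one method).
class Task {
  constructor(attrs) {
    this.attrs = attrs;
  }

  // The single source of truth for "may this user edit/ship this task?"
  canEdit(userId) {
    return this.attrs.ownerId === userId;
  }
}

// Browser side: decide whether to render the "ship" control at all.
function shouldShowShipButton(task, currentUserId) {
  return task.canEdit(currentUserId);
}

// Server side: the user id comes from the session, never from the payload.
function buildShipJob(session, payload) {
  return {
    userId: session.userId,
    taskId: payload.taskId,
    memberId: payload.memberId
  };
}

// Worker side: re-inflate the model and reuse the same check.
function authorizeShipJob(job, taskData) {
  const task = new Task(taskData);
  return task.canEdit(job.userId) ? null : 'notAllowed';
}

// A client claiming to be "alice" while logged in as "mallory" gets nowhere.
const spoofedJob = buildShipJob(
  { userId: 'mallory' },
  { userId: 'alice', taskId: 't1', memberId: 'm1' }
);
const verdict = authorizeShipJob(spoofedJob, { id: 't1', ownerId: 'alice' });
```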

Security

One other, hugely important thing that we get by using Capsule models on the server is some security features. There are some model attributes that are read only as far as the client is concerned. What if we get a job that tries to edit a user’s ID? In a backbone model if I call:

backboneModelInstance.set({id: 'newId'});

That will change the ID of the object. Clearly that’s not good in a server environment when you’re trusting that to be a unique ID. There are also lots of other fields you may want on the client but you don’t want to let users edit.

Again, we can encapsulate that logic in our Capsule models. Capsule models have a safeSet method that assumes all inputs are evil. Unless an attribute is whitelisted as clientEditable it won’t set it. So when we go to set attributes within the worker on the server we use safeSet when dealing with untrusted input.
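
The whitelist idea can be sketched in a few lines. This is not Capsule's implementation: SafeModel is a hypothetical plain class, and silently dropping non-whitelisted keys (rather than throwing) is an assumption made for the example.

```javascript
// Hypothetical model with a whitelist of client-editable attributes.
class SafeModel {
  constructor(attrs, clientEditable) {
    this.attributes = Object.assign({}, attrs);
    this.clientEditable = new Set(clientEditable);
  }

  // Trusted path: sets anything (server-originated data only).
  set(attrs) {
    Object.assign(this.attributes, attrs);
    return this;
  }

  // Untrusted path: copies only whitelisted keys, ignores the rest.
  safeSet(attrs) {
    for (const key of Object.keys(attrs)) {
      if (this.clientEditable.has(key)) {
        this.attributes[key] = attrs[key];
      }
    }
    return this;
  }
}

const team = new SafeModel({ id: 'team-1', name: 'Old name' }, ['name']);

// The id change is dropped, the name change goes through.
team.safeSet({ id: 'hijacked', name: 'New name' });
```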

The other important piece of securing a system that lets users indirectly add jobs to your job system is ensuring that the jobs you receive validate against your schema. I’m using a node implementation of JSON Schema for this. I’ve heard some complaints about that proposed standard, but it works really well for the fairly simple use case I need it for.

A typical worker may look something like this:

// Assumes editTeamJob was created elsewhere (for example with
// thoonk.job('editTeam')) and that async, logger, Team, validateSchema
// and verifyOwnerTeam are in scope.
workers.editTeam = function () {
    var schema = {
        type: "object",
        properties: {
            user: {
                type: 'string',
                required: true
            },
            id: {
                type: 'string',
                required: true
            },
            data: {
                type: 'object',
                required: true
            }
        }
    };
    editTeamJob.get(0, function (err, json, jobId, timeout) {
        var feed = thoonk.feed('teams'),
            result,
            team,
            newAttributes,
            inflated;

        async.waterfall([
            function (cb) {
                // validate our job
                validateSchema(json, schema, cb);
            },
            function (clean, cb) {
                // store some variables from our cleaned job
                result = clean;
                team = result.id;
                newAttributes = result.data;
                verifyOwnerTeam(team, cb);
            },
            function (teamData, cb) {
                // inflate our capsule model
                inflated = new Team(teamData);
                // the attributes come from a user, so use 'safeSet';
                // if they came from the server we'd use a normal 'set'
                inflated.safeSet(newAttributes);
                // move on to the next step (without this the waterfall stalls)
                cb();
            },
            function (cb) {
                // do the edit, all we're doing is storing JSON strings w/ ids
                feed.edit(JSON.stringify(inflated.toJSON()), result.id, cb);
            }
        ], function (err) {
            var code;
            if (!err) {
                code = 200;
                logger.info('edited team', {team: team, attrs: newAttributes});
            } else if (err === 'notAllowed') {
                code = 403;
                logger.warn('not allowed to edit');
            } else {
                code = 500;
                logger.error('error editing team', {err: err, job: json});
            }
            // finish the job
            editTeamJob.finish(jobId, null, JSON.stringify({code: code}));
            // keep the loop crankin'
            process.nextTick(workers.editTeam);
        });
    });
};
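
The worker above calls a validateSchema helper that isn’t shown. As a rough idea of what such a helper could do, here is a hand-rolled sketch that covers only the tiny subset of JSON Schema used in that worker (type and required on top-level properties). It is not the JSON Schema library mentioned above; the error strings and the choice to drop unknown properties are assumptions.

```javascript
// Hypothetical helper: parse a job payload and check it against a very
// small schema subset. Calls cb(err) on failure or cb(null, clean) on success.
function validateSchema(json, schema, cb) {
  let data;
  try {
    data = JSON.parse(json);
  } catch (e) {
    return cb('invalidJson');
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return cb('invalidJob');
  }

  const clean = {};
  for (const [name, rule] of Object.entries(schema.properties)) {
    const value = data[name];
    if (value === undefined) {
      if (rule.required) return cb('missing:' + name);
      continue;
    }
    // null and arrays report typeof 'object', so exclude them explicitly
    const actual = (value === null || Array.isArray(value)) ? 'other' : typeof value;
    if (actual !== rule.type) return cb('badType:' + name);
    clean[name] = value; // anything not named in the schema is dropped
  }
  cb(null, clean);
}

const editTeamSchema = {
  type: 'object',
  properties: {
    user: { type: 'string', required: true },
    id: { type: 'string', required: true },
    data: { type: 'object', required: true }
  }
};

validateSchema(
  JSON.stringify({ user: 'u1', id: 'team-1', data: { name: 'New name' }, extra: true }),
  editTeamSchema,
  function (err, clean) {
    if (err) {
      console.log('rejected job:', err);
    } else {
      console.log('clean job:', clean);
    }
  }
);
```

Returning a cleaned copy rather than the parsed input means later steps never see fields the schema didn’t ask for.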

Sounds like a lot of work

Granted, writing a worker for each type of action a user can perform in the app, with all the related job and validation logic, is not an insignificant amount of work. However, it worked rather well for us to use the state syncing stuff in Capsule while we were still in the prototyping stage, then convert the server-side code to a Thoonk-based solution when we were ready to roll out to production.

So why does any of this matter?

It works.

What this ultimately means is that we can now push the system until Redis is our bottleneck. We can spin up as many workers as we want to crank through jobs and we can write those workers in any language we want. We can put our node app behind HAProxy or Bouncy and spin up a bunch of ‘em. Do we have all of this solved and done? No. But the core ideas and scaling paths seem fairly clear and doable.

[update: Just to add a bit more detail here, from our tests we feel confident that we can scale to tens of thousands of users on a single server and we believe we can scale horizontally after doing some intelligent sharding with multiple servers.]

Is this the “Rails of Realtime?”

Nope.

Personally, I’m not convinced there ever will be one. Even Owen Barnes (who originally set out to build just that with SocketStream) said at KRTConf: “There will not be a black box type framework for realtime.” His new approach is to build a set of interconnected modules for structuring out a realtime app based on the unique needs of its specific goals.

The kinds of web apps being built these days don’t fit into a neat little box. We’re talking to multiple web services, multiple databases, and pushing state to the client.

Mikeal Rogers gave a great talk at KRTConf about that exact problem. It’s going to be really, really hard to create a framework that solves all those problems in the same way that Rails or Django can solve 90% of the common problems with routes and MVC.

Can you support a BAJILLION users?

No, but a single Redis db can handle a fairly ridiculous amount of users. At the point that actually becomes our bottleneck, (1) we can split out different feeds for different databases, and (2) we’d have a user base that would make the app wildly profitable at that point—certainly more than enough to spend some more time on engineering. What’s more, Salvatore and the Redis team are putting a lot of work into clustering and scaling solutions for Redis that very well may outpace our need for sharding, etc.

Have you thought about X, Y, Z?

Maybe not! The point of this post is simply to share what we’ve learned so far.

You’ll notice this isn’t a “use our new framework” post. We would still need to do a lot of work to cleanly extract and document a complete realtime app solution from what we’ve done in &bang—particularly if we were trying to provide a tool that can be used to quickly spin up an app. If your goal is to find a tool like that, definitely check out what Owen and team are doing with SocketStream and what Nate and Brian are doing with Derby.

We love the web, and love the kinds of apps that can be built with modern web technologies. It’s our hope that by sharing what we’ve done, we can push things forward. If you find this post helpful, we’d love your feedback.

Technology is just a tool; ultimately, it’s all about building cool stuff. Check out &bang and follow me @HenrikJoreteg, Adam @AdamBrault and the whole @andyet team on the twitterwebz.


If you’re building a single page app, keep in mind that &yet offers consulting, training and development services. Hit us up (henrik@andyet.net) and tell us what we can do to help.

Realtime web app architecture with Thoonk: a series of tubes, not tables

Now you’re thinking with feeds!

When I look at a single-page webapp, all I see are feeds; I don’t even see the UI anymore. I just see lists of items that I care about. Some of which only I have access to and some of which other groups have access to. I can change, delete, re-position, and add to the items on these feeds and they’ll propagate to the people and entities that have access to them (even if it is just me on another device or at a later date).

I’ve seen it this way for years, but I haven’t grokked it enough to articulate what I was seeing until now.

What Thoonk Is

Thoonk is a series of higher-level objects built on Redis that send publish, edit, delete, and position events when they are changed. These objects are feeds for making real-time applications and feed services.

What is a Thoonk feed?

A Thoonk feed is a list of indexed data objects that are limited by topic and by what a single entity might subscribe to. An RSS/ATOM feed qualifies. What makes a Thoonk feed different from a table? A table is limited to a topic, but lacks single entity interest limitations. A Thoonk feed isn’t just a message broker, it’s a database-store that sends out events when the data changes.

Let’s use &bang as an example. Each team-member has a list of tasks. In a relational database we might have a table that looks like this:

team_member_tasks
id | team_id | member_id | description | complete bool | etc.

Whenever a user renders their list, I would query that table, limiting by a specific user and a specific team.

If we converted this table, without changing it, into a Thoonk feed, then we would only be able to subscribe to ALL tasks and not just the tasks of a particular team or member. So, instead, a Thoonk feed might look like:


team:<team_id>:member:<member_id>:tasks
{description: "", completed: false, etc, etc}

Now when the user wants a rendered list of tasks, I can do one index look-up rather than three, and I am able to subscribe to changes on the specific team member’s tasks, or even to team:353:member:*:tasks to subscribe to all of that team’s tasks.
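
To show how a wildcard feed name like that can select a whole team’s task feeds, here is a small sketch that turns a pattern into a regular expression. Redis itself offers pattern subscriptions through PSUBSCRIBE; whether Thoonk relies on that is not stated here, so treat feedPatternToRegExp as a hypothetical helper. One deliberate simplification: in this sketch "*" matches a single colon-delimited segment only.

```javascript
// Convert a feed-name pattern with "*" wildcards into a RegExp.
// Each "*" matches one segment (anything except ":").
function feedPatternToRegExp(pattern) {
  const source = pattern
    .split('*')
    // escape regex metacharacters in the literal parts
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('[^:]*');
  return new RegExp('^' + source + '$');
}

const allTasksForTeam = feedPatternToRegExp('team:353:member:*:tasks');

const feedNames = [
  'team:353:member:12:tasks',
  'team:353:member:99:tasks',
  'team:353:member:12:shipped',
  'team:354:member:12:tasks'
];

// Keeps only the task feeds that belong to team 353.
const matching = feedNames.filter((name) => allTasksForTeam.test(name));
```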

[Note: I suppose you could arrange a relational database this way, but it wouldn’t really be able to take advantage of SQL, nor could you subscribe to the table to get changes.]

It’s Feeds All the Way Up

If I use Thoonk subscribe-able feeds as my data-storage engine, life gets so much easier. When a user logs in, I can subscribe contextualized callbacks just for them to the feeds of data that they have access to read from. This way, if their data changes for any reason, by any process, by any server, it can bubble all the way up to the user without having to run any queries. I can also subscribe separate processes that can automatically scrub, pre-index, cull, or any number of tasks to any Thoonk feed a particular process cares about. I can use processes in mixed languages to provide monitoring and additional APIs to the feeds.

But What About Writes?

Let’s not think in terms of writes. Writes are just changes to feed items (publishing, editing, deleting, repositioning) that write the data to ram/disk and inform any subscribers of the change. Let’s instead think in terms of user-actions. A user-action (such as delegating a task to another user in &bang) needs ACL and may affect multiple feeds in a single call. If we defer user-actions to jobs (a special kind of Thoonk feed), we can easily isolate, scale, share, and distribute the business-logic involved in dealing with a user-action.

What Are Thoonk Jobs?

Thoonk Jobs are items that represent business-logic needing to be done reliably, a single time, by any available worker. Jobs are consumed as fast as a worker-pool can consume them. A job feed is a list of job items, each of which may be in one of three states: available, in-flight, or stalled. Available jobs are taken and are placed in an in-flight set while they are being processed. When the job is done, the job is removed from the in-flight set, and its item is deleted. If the worker fails to complete the job (either because of an error, distaste, or a monitoring process deciding that the job has timed out), the job may be placed back to the available list or the stalled set.
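
The lifecycle described above can be sketched as a small in-memory state machine. This is only an illustration of the available / in-flight / stalled transitions: MemoryJobQueue, its method names and the retry policy are hypothetical, and unlike the real thing it is neither Redis-backed nor safe across processes.

```javascript
// In-memory sketch of a job feed's states. Not Thoonk's implementation.
class MemoryJobQueue {
  constructor(maxRetries) {
    this.maxRetries = maxRetries;
    this.available = [];        // job ids waiting for a worker
    this.inFlight = new Set();  // job ids currently being processed
    this.stalled = new Set();   // job ids parked for an admin to inspect
    this.jobs = new Map();      // id -> { payload, retries }
    this.nextId = 1;
  }

  put(payload) {
    const id = String(this.nextId++);
    this.jobs.set(id, { payload: payload, retries: 0 });
    this.available.push(id);
    return id;
  }

  // A worker takes the oldest available job; it becomes in-flight.
  get() {
    const id = this.available.shift();
    if (id === undefined) return null;
    this.inFlight.add(id);
    return { id: id, payload: this.jobs.get(id).payload };
  }

  // Success: leave the in-flight set and delete the item.
  finish(id) {
    if (!this.inFlight.delete(id)) return false;
    this.jobs.delete(id);
    return true;
  }

  // Failure: retry until the limit is exceeded, then stall.
  fail(id) {
    if (!this.inFlight.delete(id)) return false;
    const job = this.jobs.get(id);
    job.retries += 1;
    if (job.retries > this.maxRetries) {
      this.stalled.add(id);
    } else {
      this.available.push(id);
    }
    return true;
  }
}

const queue = new MemoryJobQueue(1);
const jobId = queue.put(JSON.stringify({ taskId: 't1', memberId: 'm1' }));

queue.fail(queue.get().id); // first failure: goes back to available
queue.fail(queue.get().id); // second failure: exceeds the limit, stalls
```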

Why use Thoonk Jobs for User-Actions?

  • User-actions that fail for some reason can be retried (you can also limit the # of retries).
  • The work can be distributed across processes and servers.
  • User-actions can burst much faster than the workers can handle them.
  • A user-action that ultimately fails can be stalled, where an admin is informed to investigate and potentially edit and/or retry when the issue that caused it has been resolved or to test said resolution.
  • Any process in any language can contribute jobs (and get results from them) without having to re-implement the business logic or ACL.

The Last One is a Doozy

Scaling, reliability, monitoring and all of that is nice, but being able to build your application out rather than up is, I believe, the greatest reason for this approach. &bang is written in node.js, but if I have a favorite library for implementing a REST interface or an XMPP interface written in Python or Ruby (or any other language), I can quickly put that together and add it as a process. In fact, I can pretty much add any piece of functionality as a process without having to reload the rest of the application server, and really isolate a feature as its own process. User-actions from this process can be published to Thoonk Job feeds without having to worry about request validation or ACL since that is handled by the worker itself.

Rather than having a very large, complex application, I can have a series of very small processes that automatically cluster and are informed of changes in areas of their specific concerns.

Scaling Beyond Redis

Our testing indicates that Redis will not be a choke point until we have nearly 100,000 active users. The plan to scale beyond that is to shard &bang by teams. A quick look-up will tell us which server a team resides on, and users and processes can subscribe callbacks to connections on those servers. In that way, we can run many Redis servers, and theoretically scale horizontally. High-availability is handled by a slave for each shard and a gossip protocol for promoting slaves.
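
The "quick look-up" could be as simple as a team-to-server table. The sketch below assumes exactly that; the team ids, host names and the shardForTeam helper are made up for illustration, and the post does not say how the mapping is actually stored.

```javascript
// Hypothetical directory mapping each team to the Redis server that holds it.
const shardByTeam = new Map([
  ['team-353', 'redis-a.example.internal:6379'],
  ['team-354', 'redis-b.example.internal:6379']
]);

// Resolve the shard for a team, failing loudly for unknown teams
// instead of silently writing to the wrong server.
function shardForTeam(teamId) {
  const address = shardByTeam.get(teamId);
  if (!address) {
    throw new Error('no shard assigned for team ' + teamId);
  }
  return address;
}

const address = shardForTeam('team-353');
```

Because every feed for a team lives on one shard, a job that touches several of that team’s feeds never has to span servers.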

Conflict Resolution and Missed Updates

Henrik’s recent post spawned a couple of questions about conflict resolution. First I’ll give a deflection, and then I’ll give a real answer.

&bang doesn’t yet need conflict resolution. None of the writes are actually done on the client as they are all RPC calls which go into a job queue. Then the workers validate the payload, check the ACL, and update some feeds, at which point the data bubbles back up to the client. The feed updates are atomic, and happen quite quickly. Also, two users being able to edit the same item only comes up with delegated tasks, in which case the most recent edit wins.

Ok, now the real answer. Thoonk is going to have revision history and incrementing revision numbers for 1.0. Each historical item is the same as the publish/edit/delete/reposition updates that are sent via pubsub. When a user change job is done, the client can send its current revision numbers for the feeds involved, and thus conflicts on an edit can be detected. The historical data should be enough data to facilitate some form of conflict resolution (determined by the application implementer). The revision numbers can also bubble up to the client, so the client can detect missing updates and ask for a replay from a given revision number.
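
Since those features are still planned, the following is only a guess at how revision numbers could be used, not Thoonk’s design. It models a feed’s history as an array of {rev, type, id} entries with increasing rev; all three function names are hypothetical.

```javascript
// An edit conflicts if the item changed after the revision the client last saw.
function detectConflict(history, clientRev, itemId) {
  return history.some((entry) => entry.rev > clientRev && entry.id === itemId);
}

// Everything the client needs replayed to catch up.
function missedUpdates(history, lastSeenRev) {
  return history.filter((entry) => entry.rev > lastSeenRev);
}

// A client can notice a dropped event when revisions skip a number.
function hasGap(lastSeenRev, incomingRev) {
  return incomingRev > lastSeenRev + 1;
}

const history = [
  { rev: 1, type: 'publish', id: 'task-1' },
  { rev: 2, type: 'edit', id: 'task-1' },
  { rev: 3, type: 'publish', id: 'task-2' }
];

// The client last saw revision 1 and now wants to edit task-1,
// which was already edited at revision 2.
const conflict = detectConflict(history, 1, 'task-1');
const replay = missedUpdates(history, 1);
```

What to do once a conflict is detected (merge, reject, or last write wins) stays up to the application.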

Currently we’re punting on missed items. Anytime the &bang user is disconnected, the app is disabled and refreshed when it is able to reconnect. A more elaborate solution using the new Thoonk features I just listed is probably coming and perhaps some real offline-mode support with local “dirty” changes that get resolved when you come back online.

All Combined

Using Thoonk, we were able to make &bang scale to tens of thousands of active users on a single server, burst user-activity beyond our choke-points, isolate user-action business-logic and ACL, automatically cluster to more servers and processes, choose any language with a Redis client library for individual features and interfaces, bubble data changes all the way up to the user regardless of the source of change, provide an easy way of iterating, and generally create a kick-ass, realtime, single-page webapp.

Can I Use Thoonk Now?

Thoonk.js and Thoonk.py are MIT licensed, and free to use. While we are using Thoonk.js in production and it is stable there, the API is not final. Currently I’m moving the feed logic to Redis Lua scripts, which will be officially supported in Redis 2.6 with an RC1 promised for this December. I plan to be ready for that. The Lua scripting will give us performance gains, and remove unnecessary extra logic to keep publish/edit/delete/reposition commands atomic, but most importantly it will allow us to share the core code with all implementations of Thoonk, allowing us to easily add and support more languages. As mentioned previously, as I do the Redis Lua scripting, I’ll be adding revision history and revision numbers to feeds, which will facilitate conflict detection and replay of missed events.

That said, feel free to comment, contribute, steal, or abuse the project in the meantime. A 1.0 release will indicate API stability, and I will encourage its use in production at that point. I will soon be breaking out the Lua scripts to their own git repo for easy implementation.

If you want to keep an eye on what we’re doing, follow me @fritzy and @andyet on twitter. Also be sure to check out &bang for getting stuff done with your team.


If you’re building a single page app, keep in mind that &yet offers consulting, training and development services. Shoot Henrik an email (henrik@andyet.net) and tell us what we can do to help.

Adam Baldwin and Nathan LaFreniere are yetis.

Security expert and dev/ops badass join the &yet team January 1

Because we are huge fans of human namespace collisions and amazing people, we’re adding two new members to our team: Adam Baldwin and Nathan LaFreniere, both in transition from nGenuity, the security company Adam Baldwin co-founded and built into a well-respected consultancy that has advised the likes of GitHub, AirBNB, and LastPass on security.


We have relied on Adam and Nathan’s services through nGenuity to inform, improve, and check our development process, validating and invalidating our team’s work and process, providing education and correction along the way. We are thrilled to be able to bring these resources to bear with greater influence, while providing Adam Baldwin with the authority to improve areas in need of such.

Adam Baldwin

Adam Baldwin has served as &yet’s most essential advisor since our first year, providing me with confidence in venturing more into development as an addition to my initial web design freelance business, playing “panoptic debugger” when I struggled with it, helping us establish good policy and process as we built our team, improving our system operations, and always, always, bludgeoning us about the head regarding security.

It really can’t be expressed how much respect I and our team at &yet have for Adam and his work.

He’s uncovered Basecamp vulnerabilities that encouraged 37Signals to change their policies for handling reported vulnerabilities, found huge holes in Sprint/Verizon MiFi (that made for one of the most hilarious stories I’ve been a part of), published vulnerabilities *twice* to root Rackspace, shared research with uberhackers at DEFCON, and has provided security advice for a number of first-class web apps, including ones you’re using today and conceivably right now.

Adam Baldwin will be joining our team at &yet as CSO—it’s a double title: Chief of Software Operations and Chief Security Officer.

Adam will be adding his security consultancy, alongside &yet’s other consulting services, but will also be overseeing our team’s software processes, something he has informed, shaped, and helped externally verify since, I think, before most of our team was born.

On a personal note (a longer version of which is here), I must say it’s a real joy to be able to welcome one of my best friends into helping lead a business he helped build as much as anyone on our team.

Nathan LaFreniere

As excited as I am personally to add Adam Baldwin, our dev team is even more thrilled about adding Nathan, whose services we have become well accustomed to relying on in our contract with nGenuity and in a large project where we’ve served a mutual customer.

Nathan is a multitalented dev/ops badass well-versed in automated deployment tools.

He solves operations problems with a combination of experience, innovation, and willingness to learn new tools and approaches.

He’s already gained a significant depth of experience building custom production systems for Node.js, including some tools we’ve come to rely on heavily for &bang.

Nathan’s passion for well-architected, smoothly running, and meticulously monitored servers has helped our developers sleep at night, very literally.

I know getting the luxury of having a huge amount of Nathan’s time at our developers’ disposal sounds to them like diving into a pool of soft kittens who don’t mind you diving on them and aren’t hurt at all by it either oh and they’re declawed and maybe wear dentures but took them out.

So that’s what we have for you today.

We think you’re gonna love it.
