Software Development

API Design: We’ll let the users sort it out

In my previous post, I explained an API design that give the user the option to perform an immediate operation, use the default fire and forget or use an explicit bulk mechanism. The idea is that most operations are small, and that the cost of actually going over the network is going to dominate the cost of the entire operation. In this case, we want to give the user the option of selecting the “I want to know the operation completed” or “I just want to try the best, I’m fine if there is a failure” modes.

Eli asks:

Trying to understand why this isn’t just a scripting case. In your example you loop over CSV and propose an increment call per part which gets batched up and committed outside the user’s control. Why not define a JavaScript on the server that takes an array (“a batch”) of parts sent over the wire and iterates over it on the server instead? This way the user gets fine grained control over the batch. I’m guessing the answer has something to do with your distributed counter architecture…

If I understand Eli properly, the idea is that we’ll just expose an API like this:

Increment(“users/1/visits”, 1);

And provide an endpoint where a user can POST a JavaScript code that will call this. The idea is that the user will be able to decide whatever to call this once, or send a whole batch of updates in one go.

This is certainly an option, but in my considered opinion, it is a pretty bad one. It has nothing to do with the distributed architecture, it has to do with the burden we put on the user. The actual semantics of “go all the way to the server and confirm the operation” vs “let us do a bulk insert kind of thing” are pretty easy. Each of them has a pretty well defined behavior.

But what happens when you want to do an operation per page view? From the point of view of your code, you are making a single operation (incrementing the counters for a particular entity). From the point of view of the system as a whole, you are generating a whole lot of individual requests that would be much better off as a single bulk request.

Having a scripting endpoint gives the user the option of doing that, sure, but then they need to handle:

  • Error recovery
  • Multi threading
  • Flushing on batch size / time
  • Failover

And many more that I’m probably forgetting. By providing the users with the option of making an informed choice about speed vs. safety, we avoid putting the onus of the actual implementation on them.

Related Articles

Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button