How to write an API that makes sense


Writing APIs is part of a developer's everyday job and, since it is such an important task, in this document I try to collect some scattered thoughts on how to design an API that "makes sense" from the point of view of the developer writing the client code. The two main takeaways are "abstraction" and "contract", and the accent is on the letter "I" of API.


I promised myself I would write this document many years ago. Year after year, I put it off because I had the impression that I was not ready yet. Maybe that was true. Anyway, in all those years I wrote this document many times in my head without ever writing it down and, thinking back, that was probably a mistake, since I could have improved it over time. Unfortunately, that's the way it went. Let's start, then.

This document is not about Web APIs or about the technology behind an API (e.g., loading and linking). It is about the "I" in the acronym: interface. I am not going to talk about Web APIs at all, nor about any specific API, apart from the examples. I am going to talk about what you should do to create a usable interface, one that makes sense to use and that, above all, is simple and intuitive.

Recently (over the last several years or so) the term API has become a synonym for Web API. This is because everything is becoming a Web Service of some kind and interacting with one requires some interface: the term API was already there, so why not reuse it?

What's an API?

In simple terms, an API is a contract between a provider of some services (a library, a web application…) and a client of those services (an application or another library). In less simple terms, an API is an abstraction over some computational services and functionalities that enables client code to use those services in a "reasonable" way, making them reusable many times and freeing the programmer of the client code from rewriting those functionalities from scratch every time. APIs are a key feature of software development and a good practice in software engineering.

The key concepts in the paragraph above are "abstraction" and "contract". "Abstraction" because the API needs to hide the gory details of what goes on in the service code. "Contract" because each of the "functions" that compose the API provides some functionality that, given some input parameters and, optionally, the current state, produces an output.

I generally use the word "service" to describe what is hidden by an API. Those services are collections of functionalities that can be categorized into two families: libraries and frameworks.

A library is a collection of disjoint, or partially disjoint, functions and data structures. They are in the same library because they are "similar" or because they belong to the same "domain". A good example is the math library that comes with your favourite language and compiler (e.g., libm for the C language). All the functions are independent and do not need to be used in any particular order or combination, with rare exceptions such as random number generation functions that need some state to be initialized (e.g., srand and rand in the C standard library).
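As a small illustration of this independence, the sketch below uses two unrelated math functions and then the one coupled pair mentioned above, srand and rand, through their C++ headers. The helper names (hypotenuse, first_random_after_seed) and the seed values are assumptions made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Independent library functions: each call stands alone, in any order.
double hypotenuse(double a, double b) {
    return std::sqrt(a * a + b * b);   // no setup call is required first
}

// The rare coupled case: rand() depends on hidden state set by srand().
// Hypothetical helper: reseeds the generator, then returns the first value drawn.
int first_random_after_seed(unsigned seed) {
    std::srand(seed);      // initialize the hidden generator state
    return std::rand();    // the same seed reproduces the same sequence
}

void usage_example() {
    assert(hypotenuse(3.0, 4.0) == 5.0);
    assert(std::floor(2.7) == 2.0);
    // Reseeding with the same value gives the same first number again.
    assert(first_random_after_seed(42) == first_random_after_seed(42));
}
```

Nothing forces a caller to invoke sqrt before floor or the other way round; only the srand and rand pair carries an ordering requirement, which is what makes it the exception.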

A framework, on the other hand, is a collection of coupled functions and data structures. Coupled means that the order in which those functions are called, and the way they are combined with each other, matters and defines how the system behaves. A framework needs the client code to be built around the framework itself and, in a certain sense, the framework shapes the application built with it. In essence, a framework defines a "language" used to program a particular class of applications. This idea of the framework as a language has been interpreted literally in cases like OpenGL, which defines two functions, glBegin(…) and glEnd(…), to enclose the various drawing operations, and which also exposes a stack, through push and pop operations, used to save and restore the transformation matrices in the application.
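The coupling described above can be sketched with a toy class. This is not OpenGL: Canvas and all of its methods are hypothetical names invented for the example, and the only "transformation" modelled is a translation. It only mirrors the shape of the idea: drawing calls are valid solely between begin() and end(), and push() and pop() save and restore the current transform.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical mini "framework": call order matters, unlike in a library.
class Canvas {
public:
    struct Point { double x, y; };

    void begin() {
        if (drawing_) throw std::logic_error("begin() called twice");
        drawing_ = true;
    }
    void vertex(double x, double y) {
        if (!drawing_) throw std::logic_error("vertex() outside begin()/end()");
        points_.push_back({x + dx_, y + dy_});   // apply the current translation
    }
    void end() {
        if (!drawing_) throw std::logic_error("end() without begin()");
        drawing_ = false;
    }
    void translate(double dx, double dy) { dx_ += dx; dy_ += dy; }
    void push() { saved_.push_back({dx_, dy_}); }   // save the current transform
    void pop() {                                    // restore the last saved one
        if (saved_.empty()) throw std::logic_error("pop() on an empty stack");
        dx_ = saved_.back().x;
        dy_ = saved_.back().y;
        saved_.pop_back();
    }
    const std::vector<Point>& points() const { return points_; }

private:
    bool drawing_ = false;
    double dx_ = 0.0, dy_ = 0.0;
    std::vector<Point> points_;
    std::vector<Point> saved_;
};

void usage_example() {
    Canvas c;
    c.push();
    c.translate(10.0, 0.0);
    c.begin();
    c.vertex(1.0, 1.0);   // recorded as (11, 1)
    c.end();
    c.pop();              // back to the untranslated state
    c.begin();
    c.vertex(1.0, 1.0);   // recorded as (1, 1)
    c.end();
    assert(c.points().size() == 2);
    assert(c.points()[0].x == 11.0);
    assert(c.points()[1].x == 1.0);
}
```

The client code reads like a small program written in the framework's own "language", and calling vertex() outside a begin()/end() pair is an error rather than a harmless reordering.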

Since I am going to discuss the "I" in API: how can I shape an API's interface, deciding which functions should be exposed and how they should be composed?

How to define an API?

As far as I know, and in my own experience, there are two ways to give shape to an API or, better, to its interface.

The first one is very simple: if you are creating an API for some well-known domain, stick to the way that domain is usually represented or known to the users of the API. This is not only about the programmers but, mainly, about the people who work in that domain. For instance, if you are writing an API for matrix computations, there is no reason why you should create a new way to interact with matrices. Matrices are well-known mathematical objects composed of rows and columns of values, and it is possible to apply some mathematical functions to those values, e.g., it is possible to sum or multiply matrices. Another example is implementing some physics, chemistry, medicine or biology computations: no matter how weirdly those things are usually expressed, at least through the eyes of a non-expert in those domains, that is exactly the level of abstraction that the users of such an API will expect. A classic example of this is the following one (it is C++ and it is taken from Stroustrup's page):

void Decompose(double delt, SymTensor& V, Tensor& R, const Tensor& L) {
	SymTensor D = Sym(L);
	AntiTensor W = Anti(L);
	Vector z = Dual(V * D);
	Vector omega = Dual(W) - 2.0 * Inverse(V - Tr(V) * One) * z;
	AntiTensor Omega = 0.5 * Dual(omega);
	R = Inverse(One - 0.5 * delt * Omega) * (One + 0.5 * delt * Omega) * R;
	V += delt * Sym(L * V - V * Omega);
}

That, while it is almost incomprehensible to most developers or casual users, is "obvious" to any physicist who knows what polar decomposition is — a reasonable definition of "obvious" applies in this case.

This way of structuring an API applies very well not only to frameworks but also to simple libraries that collect disjoint functions. Advanced mathematics libraries such as BLAS are a very good example of this way of doing things.
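For the matrix case mentioned earlier, sticking to the domain's own notation could look like the sketch below. Matrix2 is a hypothetical 2x2 type invented for the example (it is not how BLAS exposes its routines); the point is only that the sum and the product are spelled A + B and A * B, the way they are written on paper.

```cpp
#include <cassert>

// Hypothetical 2x2 matrix type; the name and the storage layout are assumptions.
struct Matrix2 {
    double v[2][2];   // v[row][column]
};

// Element-wise sum, written as a + b just like in the mathematics.
Matrix2 operator+(const Matrix2& a, const Matrix2& b) {
    Matrix2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r.v[i][j] = a.v[i][j] + b.v[i][j];
    return r;
}

// Row-by-column product, written as a * b.
Matrix2 operator*(const Matrix2& a, const Matrix2& b) {
    Matrix2 r{};   // value-initialized, so every entry starts at zero
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                r.v[i][j] += a.v[i][k] * b.v[k][j];
    return r;
}

void usage_example() {
    const Matrix2 a{{{1.0, 2.0}, {3.0, 4.0}}};
    const Matrix2 identity{{{1.0, 0.0}, {0.0, 1.0}}};
    const Matrix2 product = a * identity;   // multiplying by the identity keeps a
    const Matrix2 sum = a + a;
    assert(product.v[1][0] == 3.0);
    assert(sum.v[1][1] == 8.0);
}
```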

The second one is the real contribution of this write-up: if you are creating a library or framework for some new domain, or you want to create a new (hopefully better) abstraction over a known domain such as I/O, threading, graphics or networking, the right place to start is not the problem itself but the way you would like to use such an API.

When I need to write a library or framework to isolate and abstract some concept, I start by writing the code that will use the library that does not exist yet. Of course, this code has absolutely no chance of compiling or running before the framework is done (and it is not even guaranteed that it will once it is done, since requirements can change along the way), but it shifts the focus of the API design from the problem domain to the solution domain. Notice that while this potentially simplifies the way the framework is used, it does not guarantee that the framework or library will be easy to write. But, honestly, this is the whole point of creating an abstraction: you must be ready to pay a high cost once, when writing the library, so that it is intuitive and easy to use many times. It is not cost effective if you want to use the library only once but, at that point, you probably do not need a library.

From a certain point of view, this is similar to what is usually referred to as Test Driven Development (TDD). In TDD, you start by writing the tests for the code that you are going to implement. In this way, at each point of the development you are able to test your functionality and understand whether you are on target or not.

This way of writing APIs differs from TDD in that the final API does not need to work exactly the way the mock client code uses it. Writing the client code before starting to write the API is not part of the "coding" phase of the development cycle: it is part of the "design" phase.

Such a fake application is not even meant to be your client application: it is only a different way to work out the design. Instead of drawing boxes on a piece of paper or a whiteboard, we write some mock code to clear our minds about what the right interface should be, simulating different application scenarios. In a certain sense, we could call this approach "Interface Driven Development" (I hope that the term is not taken yet! Otherwise, I am sorry for that).
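To make the approach concrete, here is a minimal sketch under invented assumptions: the domain (reading key=value settings), the Settings class and the greeting_line function are all hypothetical. The function at the bottom stands for the mock client code, which would be written first; the class above it is the interface that was then shaped to make that client code compile. C++ requires the class to appear first in the file, but that is not the order of authorship.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Written second: the smallest interface that satisfies the client code below.
class Settings {
public:
    // Parses lines of the form key=value; lines without '=' are skipped.
    static Settings parse(const std::string& text) {
        Settings s;
        std::istringstream in(text);
        std::string line;
        while (std::getline(in, line)) {
            const auto eq = line.find('=');
            if (eq == std::string::npos) continue;
            s.values_[line.substr(0, eq)] = line.substr(eq + 1);
        }
        return s;
    }
    // Returns the stored value, or the fallback when the key is missing.
    std::string get(const std::string& key, const std::string& fallback) const {
        const auto it = values_.find(key);
        return it == values_.end() ? fallback : it->second;
    }

private:
    std::map<std::string, std::string> values_;
};

// Written first: the client code as I would like to be able to write it.
std::string greeting_line(const std::string& config_text) {
    const Settings settings = Settings::parse(config_text);
    return settings.get("greeting", "Hello") + ", " +
           settings.get("name", "stranger") + "!";
}

void usage_example() {
    assert(greeting_line("name=Ada\ngreeting=Welcome") == "Welcome, Ada!");
    assert(greeting_line("") == "Hello, stranger!");
}
```

The fallback parameter of get is there only because the mock client wanted to avoid checking for missing keys; a design that started from the parser would probably not have produced it.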

When I practiced this technique, I found myself changing the interface, simplifying it and correcting it many times over multiple pieces of mock client code. Usually, once this design phase was over and I started developing the API itself, I found it extremely clear what should happen inside the library or framework.

A nice discovery was that the code using the library, since the API was designed around clear and realistic use cases, tended to need fewer comments than the code I normally write using third-party libraries that are (probably) designed from the inside out. The client code was more fluid and could be read as prose.

Another discovery was that boilerplate and repeated code were almost nonexistent. This is because everything had already been abstracted away in the iterative process of simulating the use of the API: the very moment repetition was spotted, it was possible to think of a solution that factored that kind of code out and hid it in other, higher-level functions.

The major takeaway of the above technique is that it enables the writer of the API, if they apply it, to stop at just the right level of abstraction. Not too high and not too low: exactly the level required for the job that needs to be done.

Appendix: The struggle between the Complex-API and the Baby-API

This part is a complement to the above rather than a natural conclusion to it. There are things that cannot be simplified beyond a certain point. If an application domain is very complex, it is very difficult to create a simple abstraction. Having said that, it is possible to create a two-level API.

The first level is the Complex-API. Let's pretend for a moment that we are writing an API to make HTTP requests over a network. Assuming that we want to be able to create the requests in a flexible way and to inspect the response(s), we could end up with an HttpClient class providing a generic retrieve method that takes an HttpRequest object as input and returns an HttpResponse. The HttpRequest and HttpResponse classes have methods that enable the client code to write and read HTTP headers and to specify the HTTP verb to be used and the path to be retrieved. Moreover, the HttpResponse can contain the chain of other HttpRequest and HttpResponse objects that were needed to retrieve the document because of redirects. The HttpClient class exposes methods to define the behaviour in case of redirects, which server is to be contacted, whether the connection needs to be kept alive, etc. Unless a connection problem occurs, an HttpResponse is always returned and the status of the operation is contained in the HttpResponse object itself.

That (I am quite convinced, since I have written more than one of those libraries in my life) is the right level of abstraction for a very flexible library for HTTP client-server interaction. Is that the right level of abstraction for everyone?

If flexibility is not needed or is not a major priority, the Complex-API can also expose a Baby-API part for simplified cases. The very same task as above, if modifying the headers or inspecting any redirects is not needed, can be done through a simpler interface: there can be a SimpleHttpClient class that exposes a method for each HTTP verb (e.g., get, post…), taking as input a string (the URL pointing to the resource that we want to retrieve) and returning a string/blob which is the content of the document itself. If an error occurs while retrieving the document, an exception is thrown.

The Baby-API can coexist with the Complex-API. The Baby-API can, and actually should, be built on top of the normal API. This solution gives the API both flexibility and, in all the cases in which maximum control over what is going on is not needed, simplicity. Together, the Complex-API and the Baby-API let developers pay only for what they get. If they want more flexibility they need to pay a higher development cost; if not, they can just use the Baby-API to babble some simplified code.
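A rough sketch of the two levels follows. The class names come from the description above, but every signature is an assumption, redirect chains and connection options are left out, and the network is replaced by an injected stand-in (fake_transport, a made-up function with a made-up URL) so that the example runs without a server.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

struct HttpRequest {
    std::string verb = "GET";
    std::string url;
    std::map<std::string, std::string> headers;
};

struct HttpResponse {
    int status = 0;
    std::map<std::string, std::string> headers;
    std::string body;
};

// Complex-API: full control over the request; the status lives in the response.
class HttpClient {
public:
    // Assumption: the transport is injected; a real client would talk to a socket.
    using Transport = std::function<HttpResponse(const HttpRequest&)>;
    explicit HttpClient(Transport transport) : transport_(std::move(transport)) {}

    HttpResponse retrieve(const HttpRequest& request) const {
        return transport_(request);
    }

private:
    Transport transport_;
};

// Baby-API: built on top of the Complex-API, a URL in and the content out.
class SimpleHttpClient {
public:
    explicit SimpleHttpClient(HttpClient client) : client_(std::move(client)) {}

    std::string get(const std::string& url) const { return send("GET", url); }
    std::string post(const std::string& url) const { return send("POST", url); }

private:
    std::string send(const std::string& verb, const std::string& url) const {
        HttpRequest request;
        request.verb = verb;
        request.url = url;
        const HttpResponse response = client_.retrieve(request);
        // Any non-2xx status becomes an exception at this level.
        if (response.status < 200 || response.status >= 300)
            throw std::runtime_error("HTTP error " + std::to_string(response.status));
        return response.body;
    }
    HttpClient client_;
};

// Stand-in server used only for the example: knows a single URL.
HttpResponse fake_transport(const HttpRequest& request) {
    HttpResponse response;
    if (request.url == "http://example.test/hello") {
        response.status = 200;
        response.body = request.verb + " ok";
    } else {
        response.status = 404;
    }
    return response;
}

void usage_example() {
    HttpClient full(fake_transport);
    HttpRequest request;
    request.url = "http://example.test/missing";
    request.headers["Accept"] = "text/plain";
    assert(full.retrieve(request).status == 404);   // the caller inspects the status

    SimpleHttpClient simple(full);
    assert(simple.get("http://example.test/hello") == "GET ok");
}
```

The same failed lookup is a status code to inspect at the Complex-API level and an exception at the Baby-API level, which matches the split described above: more control costs more code, and the simple path stays a one-liner.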