Sunday, May 2, 2010

HTTP client made easy by using the asio library

Recently I had another chance to code a couple of HTTP clients, making use of the asio library as the foundation. This time I refined the approach that I had taken a couple of years ago, by componentizing things as follows.
  • HTTP client hierarchy
  • Request classes
  • Response-handling classes
The client hierarchy consists of a base class that contains the primary logic flow common to both the ordinary (http://...) and secure (https://...) HTTP calls, for each of which a derived class is designed. The base and derived classes are tied together nicely by taking advantage of the template method pattern.
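
To make the arrangement concrete, here is a minimal sketch of how the template method pattern might tie such a hierarchy together. The class and member names (HttpClientBase, PlainHttpClient, SecureHttpClient, call, trace) are hypothetical, and the network steps are replaced by stand-ins that only record what they would do, so none of this is asio code or the actual implementation.

```cpp
#include <string>

// Hypothetical base class: it fixes the order of the steps, while the
// derived classes supply the transport-specific details.
class HttpClientBase {
public:
    virtual ~HttpClientBase() {}

    // The template method: the flow is the same for http and https.
    std::string call(const std::string& request) {
        connect();
        send(request);
        std::string reply = receive();
        disconnect();
        return reply;
    }

    // Record of the steps taken, kept only so the sketch can be inspected.
    const std::string& trace() const { return trace_; }

protected:
    virtual void connect() = 0;
    virtual void send(const std::string& request) = 0;
    virtual std::string receive() = 0;
    virtual void disconnect() {}   // optional hook, no-op by default

    std::string trace_;
};

// Stand-in for the ordinary (http://...) client.
class PlainHttpClient : public HttpClientBase {
protected:
    void connect() { trace_ += "tcp-connect;"; }
    void send(const std::string& request) { trace_ += "send:" + request + ";"; }
    std::string receive() { trace_ += "recv;"; return "plain-reply"; }
};

// Stand-in for the secure (https://...) client: same flow, extra steps
// inside the hooks.
class SecureHttpClient : public HttpClientBase {
protected:
    void connect() { trace_ += "tcp-connect;tls-handshake;"; }
    void send(const std::string& request) { trace_ += "send:" + request + ";"; }
    std::string receive() { trace_ += "recv;"; return "secure-reply"; }
    void disconnect() { trace_ += "tls-shutdown;"; }
};
```

A caller would only ever use call(); whether a handshake happens along the way is decided by which derived class was instantiated.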

With the aid of asio, it takes just a few lines to code the core logic flow of the non-secure client, roughly as follows.
asio::ip::tcp::iostream s(serverAddress, serverPort);
HttpRequest req(...);
s << req << std::flush;
HttpResponse resp;
s >> resp;
The secure client is a little more involved, but it still takes only a dozen or so lines of code.

What is left to do is to design and code the request and response classes specific to the particular service and/or operation offered on the server side. A request must be constructed as required by the server, and the corresponding response certainly varies by the service/operation requested/invoked. For a set of related or similar services/operations, it's not hard to imagine that hierarchies of requests and responses may be built, at least for code reuse.
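
For the stream-based flow to work, a request needs an operator<< and a response needs an operator>>. The following is a rough sketch of what such a pair might look like, assuming a bare-bones HTTP/1.0 exchange; the constructor arguments, the status() and body() accessors, and the choice of headers are all assumptions made for illustration, and a real response class would parse the headers rather than skip them.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical request: serializes itself as a minimal HTTP/1.0 request.
class HttpRequest {
public:
    HttpRequest(const std::string& method, const std::string& path,
                const std::string& host)
        : method_(method), path_(path), host_(host) {}

    friend std::ostream& operator<<(std::ostream& os, const HttpRequest& r) {
        return os << r.method_ << ' ' << r.path_ << " HTTP/1.0\r\n"
                  << "Host: " << r.host_ << "\r\n"
                  << "Connection: close\r\n\r\n";
    }

private:
    std::string method_, path_, host_;
};

// Hypothetical response: keeps only the status code and the raw body.
class HttpResponse {
public:
    HttpResponse() : status_(0) {}

    int status() const { return status_; }
    const std::string& body() const { return body_; }

    friend std::istream& operator>>(std::istream& is, HttpResponse& r) {
        std::string version, line;
        is >> version >> r.status_;   // e.g. "HTTP/1.0 200"
        std::getline(is, line);       // rest of the status line
        // Skip the headers; a blank line ("\r" once '\n' is stripped) ends them.
        while (std::getline(is, line) && line != "\r" && !line.empty()) {}
        // Everything up to end of stream is taken as the body.
        std::ostringstream body;
        body << is.rdbuf();
        r.body_ = body.str();
        return is;
    }

private:
    int status_;
    std::string body_;
};
```

Because both operators work on plain std::ostream and std::istream, the classes can be exercised with string streams before they are ever pointed at a socket stream.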

The asio library started off as a standalone library, and it was header-only as of at least version 1.2, which is very nice, though I'm not sure about the latest. The more recent releases of the boost libraries include asio (brought in as a sub-namespace), but my understanding is that it may no longer be used header-only as part of boost, likely due to some decisions made at the time of integration.

Sunday, March 28, 2010

Avoid copying objects held in shared_ptrs carried in a container

Well, as a general rule of thumb, one should (must) not define and use std::vector<std::auto_ptr<T> >; instead, std::vector<std::shared_ptr<T> > is recommended. But what if you must hand over the ownership of all the objects in the container? For instance, you must pass each object to a thread-safe queue through the following interface (std:: omitted for brevity henceforth):

void producer_consumer_que<T>::put(auto_ptr<T> task);

Now, the context in which the subject popped up is as follows.

I'd like to go through a data structure and make/create task objects to be passed along to the queue above. On the other end of the queue, there are a number of consumer threads that get the tasks one at a time and process them.

However, the source data structure is a bit complex and thus must be locked, at least partially, while the tasks are made/collected. As the size of the queue is limited, if the consumer threads cannot keep up with the producer, the source data structure could stay locked far too long to allow other, higher-priority accesses.

A solution is to divide the task collection and en-queuing into two stages. That is, first making and placing the task objects in a container (e.g., vector) and releasing the locks on the source data structure quickly; and then en-queuing the task objects at leisure outside the locked scope.
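
The two stages can be sketched roughly as follows. To keep the focus on the locking, the tasks here are plain ints, so the ownership question does not arise yet; Source, TaskQueue, and collect_then_enqueue are hypothetical names, the queue is an unbounded stand-in for the real bounded one, and std::mutex assumes a C++11 compiler (an older toolchain would use a platform or library mutex instead).

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical source data structure guarded by a mutex.
struct Source {
    std::mutex lock;
    std::map<std::string, int> items;
};

// Hypothetical stand-in for the producer-consumer queue. A real bounded
// queue could block inside put() while the consumers catch up.
struct TaskQueue {
    std::deque<int> tasks;
    void put(int task) { tasks.push_back(task); }
};

// Stage 1 collects the tasks under the lock; stage 2 enqueues them after
// the lock has been released. Returns the number of tasks enqueued.
std::size_t collect_then_enqueue(Source& src, TaskQueue& que) {
    std::vector<int> staged;
    {
        std::lock_guard<std::mutex> guard(src.lock);
        typedef std::map<std::string, int>::const_iterator iter;
        for (iter i = src.items.begin(), e = src.items.end(); i != e; ++i)
            staged.push_back(i->second);
    }   // the lock on the source is released here

    // A slow or blocking put() no longer holds up other users of src.
    for (std::size_t k = 0; k < staged.size(); ++k)
        que.put(staged[k]);
    return staged.size();
}
```

The cost of the split is the temporary vector; the gain is that the time the source stays locked no longer depends on how fast the consumers drain the queue.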

This comes around to the difficulty stated at the very beginning of this post. If the task objects are in a vector<shared_ptr<T> >, there is no way to release the objects, each held by a shared_ptr. A copy must be made and held in an auto_ptr, which may, in turn, be passed along to the queue.

I came up with the following solution in my sleep, to avoid the extraneous copying. The container may be defined as

typedef vector<shared_ptr<auto_ptr<T> > >    container;

The task objects can then be made and added to the collection as follows.

container coll;
auto_ptr<T> t0(new T(...));
auto_ptr<auto_ptr<T> > t1(new auto_ptr<T>(t0.get()));
t0.release();
shared_ptr<auto_ptr<T> > t2(t1.get());
t1.release();
coll.push_back(t2);


To keep every step exception-safe, several local variables are declared and used above. It's possible to use fewer local variables without sacrificing robustness, e.g.,

container coll;
auto_ptr<auto_ptr<T> > t1(new auto_ptr<T>());
t1->reset(new T(...));
shared_ptr<auto_ptr<T> > t2(t1.get());
t1.release();
coll.push_back(t2);


Without worrying about exception safety, it could be as simple as

container coll;
coll.push_back(shared_ptr<auto_ptr<T> >(new auto_ptr<T>(new T(...))));


After adding all the task objects of type T to the container and getting out of the scope that locks the source data structure, we use a loop to transfer the objects from the vector to the queue (object que) as follows.

for (container::iterator i=coll.begin(), e=coll.end(); i!=e; ++i)
    que.put(**i);


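The inner * (the right-most one) dereferences the iterator i, yielding the shared_ptr object. The outer * (the left-most one) invokes operator*() of the shared_ptr, yielding the auto_ptr<T> object itself rather than a pointer to it. Passing that auto_ptr by value to put() transfers ownership of the task, leaving the auto_ptr held in the container empty.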

In summary, a task object of type T is created once and ends up being passed to the producer-consumer queue with ownership transfer (to the queue). (As a matter of fact, it'll only be destroyed eventually by the consumer thread that gets to carry the task out.)