Cleaner API Design Using Ignorable “Hints”
Sometimes API authors expose additional entry points that exist only for performance reasons. For instance, consider why MakeManyWidgets() exists in the sample below:
[code]
class WidgetFactory
{
private:
    // a really, really slow routine
    void StopProcessesSoItsSafeToMakeWidgets();
    // another really, really slow routine
    void RestartProcesses();

public:
    Widget* MakeAWidget()
    {
        StopProcessesSoItsSafeToMakeWidgets();
        Widget* w = new Widget();
        RestartProcesses();
        return w;
    }

    std::vector<Widget*> MakeManyWidgets(int HowMany)
    {
        std::vector<Widget*> ret;
        StopProcessesSoItsSafeToMakeWidgets();
        for (int temp = 0; temp < HowMany; temp++)
            ret.push_back(new Widget());
        RestartProcesses();
        return ret;
    }

    void DoSomethingElse()
    {
        assert(ProcessesRunning());
        return;
    }
};
[/code]
The extra routine was added purely as a performance convenience, since making N widgets has identical semantics to N successive calls to MakeAWidget(). This is presumably safer than publishing the private routines for stopping and restarting processes. Not only might those be an implementation detail we don’t want to expose, but the consequences could be too serious if a client mismatched the stop/start calls.
I’ve seen this pattern countless times, and never liked it. Routines like MakeManyWidgets() lead the clients of your API to start disrupting the control flow in their programs to try and get a performance payoff that may turn out to be irrelevant in the future. They also give the misleading impression that there might be a semantic significance to creating a set of widgets as a batch, and they make source code written against the API a lot harder to absorb.
If I face a situation like this, I completely decouple the performance “hint” from the routines that do the work. As a rule, I also make sure that if the hint gives blatantly incorrect information, the worst that can happen is that your program is a bit slower than it would have been without the hint. To give an example of how this might work, look at HINT_MakingManyWidgets() below:
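[code]
class WidgetFactory
{
private:
    // how many more widgets the client has hinted are still to come
    mutable int widgets_hint = 0;

private:
    // a really, really slow routine
    void StopProcessesSoItsSafeToMakeWidgets();
    // another really, really slow routine
    void RestartProcesses();

public:
    void HINT_MakingManyWidgets(int HowMany) const
    {
        // ignore nonsensical hints so the counter can never go negative
        if (HowMany > 0)
            widgets_hint += HowMany;
    }

    Widget* MakeAWidget()
    {
        if (ProcessesRunning())
            StopProcessesSoItsSafeToMakeWidgets();
        Widget* w = new Widget();
        // restart only once no further hinted widgets are expected
        if (widgets_hint > 0)
            widgets_hint--;
        if (widgets_hint == 0)
            RestartProcesses();
        return w;
    }

    void DoSomethingElse()
    {
        // a hint that overestimated leaves the processes stopped; recover here
        if (!ProcessesRunning()) {
            widgets_hint = 0;
            RestartProcesses();
        }
    }
};
[/code]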
This way, developers aren’t encouraged to complicate their code up front. No one will use these HINT_ functions unless they need to; only those clients who are dissatisfied with performance will bother. You can add as many as you like, adapted specifically to suit the real use cases of important clients. And if you ever want to stop supporting a hint, you merely turn the function into a no-op.
The worst you’ll do to your clients is slow them down, and your API and code using it will be purer and more elegant!
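To make the "worst case is only slower" claim concrete, here is a minimal, self-contained sketch. It is not the original class: the counters, the bool flag standing in for the real processes, the use of std::unique_ptr, and the SlowPairsFor() helper are all assumptions added for illustration. It also assumes the factory restarts the processes as soon as the hinted count is used up.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Widget {};

// Hypothetical stand-in for the article's WidgetFactory. The public counters
// exist only so the cost of each calling pattern can be observed.
class WidgetFactory {
public:
    int stop_calls = 0;     // how often the slow "stop" routine ran
    int restart_calls = 0;  // how often the slow "restart" routine ran

    // Pure hint: it may be wrong or ignored without affecting correctness.
    void HINT_MakingManyWidgets(int how_many) {
        if (how_many > 0) widgets_hint_ += how_many;
    }

    std::unique_ptr<Widget> MakeAWidget() {
        if (running_) Stop();
        auto w = std::make_unique<Widget>();
        if (widgets_hint_ > 0) --widgets_hint_;
        if (widgets_hint_ == 0) Restart();  // no more widgets expected
        return w;
    }

    // Any other entry point repairs the state if the hint overestimated.
    void DoSomethingElse() {
        if (!running_) {
            widgets_hint_ = 0;
            Restart();
        }
        assert(running_);
    }

    bool ProcessesRunning() const { return running_; }

private:
    void Stop()    { running_ = false; ++stop_calls; }
    void Restart() { running_ = true;  ++restart_calls; }

    bool running_ = true;
    int widgets_hint_ = 0;
};

// Hints `hint` widgets, makes `count` widgets, then calls another routine.
// Returns how many slow stop/restart pairs were paid for along the way.
int SlowPairsFor(int hint, int count) {
    WidgetFactory factory;
    factory.HINT_MakingManyWidgets(hint);
    std::vector<std::unique_ptr<Widget>> widgets;
    for (int i = 0; i < count; ++i) widgets.push_back(factory.MakeAWidget());
    factory.DoSomethingElse();  // always leaves the processes running
    assert(factory.stop_calls == factory.restart_calls);
    return factory.stop_calls;
}
```

With no hint, SlowPairsFor(0, 5) pays for five stop/restart pairs; with an accurate hint, SlowPairsFor(5, 5) pays for one. A wild overestimate such as SlowPairsFor(100, 5) still pays for only one, because DoSomethingElse() cleans up, and an underestimate such as SlowPairsFor(2, 5) merely falls back toward the unhinted cost. In every case the same five widgets come back.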

January 2nd, 2009 at 3:08 pm
Hi Brian,
I do something similar, but perhaps a little bit better. My version of the API would look like this:
[code]
[…]
public:
void BeginMakingWidgetsBatch()
{
widgets_batch_count++;
}
Widget * MakeAWidget()
{
if (ProcessesRunning())
StopProcessesSoItsSafeToMakeWidgets();
Widget * w = new Widget();
if (widgets_batch_count == 0)
RestartProcesses();
return w;
}
void EndMakingWidgetsBatch()
{
if (–widgets_batch_count == 0)
RestartProcesses();
}
[/code]
I think this is better, because it doesn’t force the caller to try and predict in advance how many widgets he is planning to make. If he wants “better performance mode”, he simply does this:
[code]
BeginMakingWidgetsBatch();
// however many calls to MakeAWidget() he cares to do, go here
EndMakingWidgetsBatch();
[/code]
… and it handles nested calls correctly as well. The processes are always restored at the last call to EndMakingWidgetsBatch(). And of course the non-batch version does the right thing as well.
The only hazard here would be the possibility that the user will call BeginMakingWidgetsBatch() and forget to call a matching EndMakingWidgetsBatch(), in which case your processes would never get restarted. So if you wanted to be extra safe/fancy, you could preclude that possibility by putting those calls into the constructor and destructor of an object that the user puts on the stack, instead:
[code]
WidgetFactoryBatchObject batchMe(&theWidgetFactory);
// any number of calls to MakeAWidget() can go here
[/code]
(Making the code exception-safe is left as an exercise to the reader ;^) )
-Jeremy
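The stack-object idea from the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the WidgetFactory here is a stripped-down, hypothetical stand-in (a bool flag replaces the real processes, MakeAWidget() returns nothing, and EndMakingWidgetsBatch() restarts only if the processes are actually stopped), and RunningAfterFailedBatch() is a made-up helper. Because destructors run during stack unwinding, the guard also covers the case where an exception escapes mid-batch.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical minimal factory exposing the Begin/End batch calls.
class WidgetFactory {
public:
    int restart_calls = 0;  // counts the slow restarts, for illustration only

    void BeginMakingWidgetsBatch() { ++batch_count_; }

    void EndMakingWidgetsBatch() {
        assert(batch_count_ > 0 && "End without a matching Begin");
        if (--batch_count_ == 0 && !running_) Restart();
    }

    // Simplified: does not actually build a widget.
    void MakeAWidget() {
        if (running_) running_ = false;    // stands in for the slow stop
        if (batch_count_ == 0) Restart();  // not batching: restart right away
    }

    bool ProcessesRunning() const { return running_; }

private:
    void Restart() { running_ = true; ++restart_calls; }

    bool running_ = true;
    int batch_count_ = 0;
};

// RAII guard: Begin in the constructor, End in the destructor, so the
// matching End call cannot be forgotten.
class WidgetFactoryBatchObject {
public:
    explicit WidgetFactoryBatchObject(WidgetFactory* factory) : factory_(factory) {
        factory_->BeginMakingWidgetsBatch();
    }
    ~WidgetFactoryBatchObject() { factory_->EndMakingWidgetsBatch(); }

    // Copying would unbalance the Begin/End count, so forbid it.
    WidgetFactoryBatchObject(const WidgetFactoryBatchObject&) = delete;
    WidgetFactoryBatchObject& operator=(const WidgetFactoryBatchObject&) = delete;

private:
    WidgetFactory* factory_;
};

// Starts a batch, fails halfway through, and reports whether the processes
// are running again once the exception has been handled.
bool RunningAfterFailedBatch(WidgetFactory& factory) {
    try {
        WidgetFactoryBatchObject batchMe(&factory);
        factory.MakeAWidget();
        throw std::runtime_error("simulated failure mid-batch");
    } catch (const std::runtime_error&) {
        // batchMe was destroyed during unwinding, so End has already run.
    }
    return factory.ProcessesRunning();
}
```

Nested guards behave like nested Begin/End calls: only the destruction of the outermost guard triggers the restart.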
January 2nd, 2009 at 3:11 pm
Oops, that should be:
[code]
void EndMakingWidgetsBatch()
{
    if (--widgets_batch_count == 0)
        RestartProcesses();
}
[/code]
January 8th, 2009 at 4:19 pm
Hi Jeremy, thanks for visiting the blog (and reading these old entries!)
You make a good point about matching the problem statement to the solution for the specific case I gave. Since I wasn’t taking advantage of any particular property of knowing the number in advance, it was a distracting choice. So I should either use your hint or introduce a benefit from knowing the allocation count (say, there’s some memory block, and knowing the number lets you allocate it in one go). Since the latter just complicates things for no reason, I’ll go with yours!
But my main thrust wasn’t any particular prescription for what kind of hints you would make. It was merely the idea of splitting exposed APIs into instructions for which every parameter has semantic meaning… and these “hints” in which ZERO parameters have meaning.
It encourages clients to write their code in a natural way–and gives the API implementers a chance to be more free in inventing interesting hints based on the specific use cases that clients are encountering. Especially since the program should still work if the implementation were changed to a no-op! Not only will the hinting not break backwards binary compatibility when clients run against old API implementations (assuming the hints aren’t statically linked), but a hint that turns out to be useless can be dropped.
I’ve always wished it were possible to give these kinds of hints to hardware. Recently I was looking at LLVM and saw they had an instruction for “prefetch” where you can give hints about the locality of your data access:
http://llvm.org/docs/LangRef.html
That looks like a pretty cool project overall, perhaps worth delving into…
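For a taste of what such a hardware hint looks like from C++, here is a small sketch using __builtin_prefetch, a builtin that GCC and Clang both provide (Clang, as far as I know, lowers it to LLVM's prefetch intrinsic). The HINT_Prefetch wrapper, the SumIndirect function, and the lookahead distance of 8 are all assumptions made up for this example, and the distance is an untuned guess. The point is the same as with the widget hints: remove the hint and the function returns exactly the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: forwards to __builtin_prefetch on GCC and Clang and
// compiles to nothing elsewhere. Either way the program's result is unchanged.
inline void HINT_Prefetch(const void* address) {
#if defined(__GNUC__) || defined(__clang__)
    // rw = 0: expect a read; locality = 3: keep in cache as long as possible
    __builtin_prefetch(address, 0, 3);
#else
    (void)address;
#endif
}

// Sums values[indices[i]] for every i.
// Precondition: every entry of `indices` is a valid index into `values`.
long long SumIndirect(const std::vector<int>& values,
                      const std::vector<std::size_t>& indices) {
    constexpr std::size_t kLookahead = 8;  // arbitrary, untuned distance
    long long total = 0;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        // Hint at the element we expect to need a few iterations from now.
        if (i + kLookahead < indices.size())
            HINT_Prefetch(&values[indices[i + kLookahead]]);
        assert(indices[i] < values.size());
        total += values[indices[i]];
    }
    return total;
}
```

Whether the hint helps at all depends on the access pattern and the machine, which is exactly why it belongs in an ignorable hint rather than in the semantics of the routine.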