Posts Tagged ‘cplusplus’

Tweaking Analog Literals (C++ humor)

Saturday, August 29th, 2009

Jeremy Friesner brought this site about analog literals to my attention. It provides the long-needed ability to represent integer constants in C++ not as numbers (like 42) but rather as 1-D, 2-D, or 3-D shapes whose length, area, or volume correspond to the number’s quantity. So for instance:

assert( ( o-------------o
          |L             \
          | L             \
          |  L             \
          |   o-------------o
          |   !             !
          !   !             !
          o   |             !
           L  |             !
            L |             !
             L|             !
              o-------------o ).volume == 
 
( o-------------o
  |             !
  !             !
  !             !
  o-------------o ).area * int(I-------------I) );

That’s great! As the inventor of Arecibo ASCII, I fully support this visual double-check with our intuitions about numbers! What if aliens are trying to read our code, but don’t know about our arbitrary choices of digits and numeric base?? This could bridge that important gap! :P

But there’s one nagging concern I have, which is that I don’t think the 1-D numeric values are very intuitive. Look at these examples from the site:

assert( I-I == 0 );
assert( I---I == 1 );
assert( I-----I == 2 );
assert( I-------I == 3 );

I’d prefer it to more consistently depict the historic concept of zero, and be less arbitrary with the “2N+1” formula of dashes to implement value N. So why not overload dereference and multiply, and define “II” to be the constant value zero? This way you can get:

assert(II == 0);
assert(I*I == 1);
assert(I**I == 2);
assert(I***I == 3);

The implementation is relatively straightforward from the proposal. But I went ahead and wrote it, and it is complete enough to give errors when compiling invalid literal specifications:

int test1 (I); // compile error!
int test2 (*I); // compile error!
int test3 (I*); // compile error!
int test4 (*I*I); // compile error!
int test5 (I*I*); // compile error!
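A minimal sketch of how operators like these could be arranged, assuming three small helper types. The names End, Stars, and Literal are hypothetical and this is not the original source; it only illustrates that unary `*` can count the stars while the single binary `*` finishes the literal.

```cpp
#include <cassert>

// Hypothetical helper types for illustration only.

// A finished literal: the only type here that converts to int.
struct Literal {
    int value;
    explicit Literal(int v) : value(v) {}
    operator int() const { return value; }
};

// A run of unary '*' applied to the right-hand I, as in *I or **I.
struct Stars {
    int count;
    explicit Stars(int c) : count(c) {}
    Stars operator*() const { return Stars(count + 1); }
};

// The bare endpoint I. It has no conversion to int, so "int x(I);" fails.
struct End {
    Stars operator*() const { return Stars(1); }
};

// The one binary '*' in each literal closes it off.
// I*I parses as End * End; I**I parses as End * (*End), and so on.
inline Literal operator*(const End&, const End&)     { return Literal(1); }
inline Literal operator*(const End&, const Stars& s) { return Literal(s.count + 1); }

const End I = End();
const Literal II = Literal(0);
```

Under this arrangement `*I*I` parses as `(*I) * I`, a Stars times an End, for which no operator exists, so it is rejected at compile time. `I*` and `I*I*` are plain syntax errors.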

I hope this makes it more practical for people to apply analog literals to real-world situations! Source below…


Cleaner API Design Using Ignorable “Hints”

Friday, August 12th, 2005

Sometimes API authors expose additional entry points to their code which exist only because of performance. For instance, look at why MakeManyWidgets() exists in the sample below:

class WidgetFactory
   {
private:
   // a really, really slow routine
   void StopProcessesSoItsSafeToMakeWidgets();
 
   // another really, really slow routine
   void RestartProcesses();
 
public:
   Widget* MakeAWidget()
      {
      StopProcessesSoItsSafeToMakeWidgets();
      Widget* w = new Widget();
      RestartProcesses();
      return w;
      }
 
   std::vector<Widget*> MakeManyWidgets(int HowMany)
      {
      std::vector<Widget*> ret;
      StopProcessesSoItsSafeToMakeWidgets();
      for (int temp = 0; temp < HowMany; temp++)
          ret.push_back(new Widget());
      RestartProcesses();
      return ret;
      }
 
   void DoSomethingElse()
      {
      assert(ProcessesRunning());
      return;
      }
   };

The extra routine was added as a mere performance convenience, since making N widgets has identical semantics to simply N successive calls to MakeAWidget(). This is presumably safer than publishing the private routines for starting and stopping processes so it’s safe to make widgets—not only might that be an implementation detail we don’t want to expose, it could be deemed too serious a problem if the client improperly matches up stop/start calls.

I’ve seen this pattern countless times, and never liked it. Routines like MakeManyWidgets() lead the clients of your API to start disrupting the control flow in their programs to try and get a performance payoff that may turn out to be irrelevant in the future. It also gives the misleading impression that there might be a semantic significance to creating a set of widgets as a batch, and will make source code written to the API a lot harder to absorb.

If I face a situation like this, I completely decouple the performance “hint” from the routines that do the work. As a rule, I also make sure that if the hint gives blatantly incorrect information, the worst that can happen is that your program is a bit slower than it would have been without the hint. To give an example of how this might work, look at HINT_MakingManyWidgets() below:

class WidgetFactory
   {
private:
   mutable int widgets_hint;   // hinted widgets still expected; assumed to start at 0
 
private:
   // a really, really slow routine
   void StopProcessesSoItsSafeToMakeWidgets();
 
   // another really, really slow routine
   void RestartProcesses();
 
public:
   void HINT_MakingManyWidgets(int HowMany) const
      {
      widgets_hint += HowMany;
      }
 
   Widget* MakeAWidget()
      {
      if (ProcessesRunning())
         StopProcessesSoItsSafeToMakeWidgets();
      Widget* w = new Widget();
      if (widgets_hint > 0)
         widgets_hint--;      // more widgets are expected, so leave the processes stopped
      else
         RestartProcesses();
      return w;
      }
 
   void DoSomethingElse()
      {
      if (!ProcessesRunning())
         {
         widgets_hint = 0;
         RestartProcesses();
         }
      }
   };

This way, developers aren’t encouraged to complicate their code up front. No one will use these HINT_ functions unless they need to; only those clients who are dissatisfied with performance will bother. You can add as many as you like, adapted specifically to suit the real use cases of important clients. And if you ever want to stop supporting a hint, you merely make the function have no effect.

The worst you’ll do to your clients is slow them down, and your API and code using it will be purer and more elegant!
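To make the payoff concrete, here is a small sketch that counts how often the slow "stop" step runs with and without a hint. CountingWidgetFactory and StopsNeeded are hypothetical stand-ins: the slow routines are replaced by counters, and the factory is assumed to leave the processes stopped while the hint is still positive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Widget {};

// Hypothetical stand-in for WidgetFactory: the slow routines only count calls.
class CountingWidgetFactory {
    bool running;
    mutable int widgets_hint;
public:
    int stops;   // how many times the expensive "stop" step ran

    CountingWidgetFactory() : running(true), widgets_hint(0), stops(0) {}

    void HINT_MakingManyWidgets(int HowMany) const { widgets_hint += HowMany; }

    Widget* MakeAWidget() {
        if (running) { running = false; ++stops; }
        Widget* w = new Widget();
        if (widgets_hint > 0)
            --widgets_hint;    // assumption: stay stopped while more are expected
        else
            running = true;    // no hint outstanding: restart right away
        return w;
    }

    void DoSomethingElse() {
        if (!running) { widgets_hint = 0; running = true; }
    }
};

// Makes `count` widgets after hinting `hinted` of them (0 means no hint),
// then does other work. Returns how many times processes had to be stopped.
inline int StopsNeeded(int count, int hinted) {
    CountingWidgetFactory factory;
    if (hinted > 0)
        factory.HINT_MakingManyWidgets(hinted);
    std::vector<Widget*> made;
    for (int i = 0; i < count; ++i)
        made.push_back(factory.MakeAWidget());
    factory.DoSomethingElse();
    for (std::size_t i = 0; i < made.size(); ++i)
        delete made[i];
    return factory.stops;
}
```

The client loop is identical in both cases. Only the optional hint call differs, and an exaggerated hint such as `StopsNeeded(5, 100)` still behaves correctly because DoSomethingElse() clears it.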

Assertions Parameterized by Location

Sunday, May 1st, 2005

Let’s say you are implementing a class to represent a Phone. You might want to have the precondition that you cannot hangUp() a phone if it was already hung up.

The typical way to do this is with assert() statements, so in the file phone.cpp you might write:

 88  class Phone
 89  {
 90  private:
 91     enum Status {
 92        Dialing,
 93        Connected,
 94        Disconnected
 95     };
 96     Status _status;
 97
 98  public:
 99     void hangUp()
100     {
101        assert(_status != Disconnected);
102        /*... some code here...*/
103     }
104
105     /*... more functions here ...*/
106  };

Then let’s say in caller.cpp you are modeling someone who wants to have a “typical” phone conversation:

42  void Alice::callBob()
43  {
44     _phone.dial("1-617-542-5942");
45     friendlyGreetings();
46     tellBruceSchneierJoke();
47     haveSpecialDiscussion();
48     _phone.hangUp();
49  }
50
51  void Alice::haveSpecialDiscussion()
52  {
53     std::string password = askForThePassword();
54     if (password == "hostilefork") {
55        describeWorldTakeover();
56        laughManiacally();
57     } else {
58        talkAboutTheWeather();
59     }
60     _phone.hangUp();
61  }

When callBob() executes and hits line 48, the Phone’s assertion will fire because it was already hung up during haveSpecialDiscussion() (on line 60). Most implementations of assert will capture the filename and line number where the actual assertion occurs, so you’d probably get a message like:

assertion failure on line 101 of phone.cpp

There’s no flexibility to let you indicate another “place” in the source. For instance, what if you wanted the assert to identify the precise offending call to hangUp()?

To address these kinds of needs I created codeplace. One of its many uses is the location-parameterized assert…which lets you pass in a “place” to be identified when the assert triggers. So in phone.cpp you would write:

 98  public:
 99     void hangUp(const codeplace& cp)
100     {
101        assert(_status != Disconnected, cp);
102        /*... some code here ...*/
103     }

Then at the call sites, you would use a special macro called HERE to create a codeplace object which you pass as a parameter. For instance, the assertion-triggering call to hangUp() in caller.cpp would look like this:

48     _phone.hangUp(HERE(false));

(Note: the forthcoming documentation for codeplace will explain what the false, true, and string parameters to “HERE” mean…)

Now when the assertion is reported by the program, it will identify the call site. With this change to our example above, you would get something more like:

assertion failure on line 48 of caller.cpp
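The underlying idea can be sketched without codeplace itself. Everything below (Place, PLACE_HERE, checkAt, SketchPhone) is a hypothetical simplification and not the codeplace API; the check throws instead of aborting so that the reported location can be inspected.

```cpp
#include <cassert>
#include <string>

// All names here are hypothetical; this is not the codeplace API.
struct Place {
    const char* file;
    int line;
    Place(const char* f, int l) : file(f), line(l) {}
};

// Captures the file and line of wherever the macro is expanded.
#define PLACE_HERE Place(__FILE__, __LINE__)

// Thrown instead of aborting so the blamed place can be examined.
struct PlaceFailure {
    Place where;
    explicit PlaceFailure(const Place& w) : where(w) {}
};

// A check that reports the place it was handed, not its own location.
inline void checkAt(bool condition, const Place& where) {
    if (!condition)
        throw PlaceFailure(where);
}

class SketchPhone {
    bool connected_;
public:
    SketchPhone() : connected_(true) {}
    void hangUp(const Place& caller) {
        checkAt(connected_, caller);   // blames the call site
        connected_ = false;
    }
};

// Hangs up twice and returns the place blamed for the second, invalid call.
inline Place placeBlamedForDoubleHangUp(const Place& first, const Place& second) {
    SketchPhone phone;
    phone.hangUp(first);
    try {
        phone.hangUp(second);
    } catch (const PlaceFailure& failure) {
        return failure.where;
    }
    return Place("", 0);   // not reached: the second hangUp always fails
}
```

A caller would normally write `phone.hangUp(PLACE_HERE);`. The helper takes explicit places only so that the example can mirror lines 60 and 48 of caller.cpp.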

The location-parameterized assert is only one application of a codeplace. They are helpful abstractions for any time we need to speak about a place in our source. For instance, if we wanted to track the previous disconnection call we could have a member variable in the Phone class to save it… and the tracked<T> template even does this for you automatically. There are some other nuances of the codeplace implementation that are beyond the scope of this particular issue.

You might theorize that run-time access to the stack is the ultimate API for dealing with this sort of thing. Imagine if Phone::hangUp() could somehow obtain an object representing the call stack, and then extract whatever information it wanted about the callers. That’s a little heavy-handed, and I believe that the odds of the API being abused are so high that a “narrower” protocol agreement between callers and subroutines would need to be established for the common scenarios.

Object Lifetime as Protocol in OOP

Thursday, April 21st, 2005

Object-oriented programming is a powerful paradigm for improving software engineering. However, it is important to realize why OOP is powerful.

Some people think the power comes from inheritance, and this means that instead of having to write new code you can just “inherit” behaviors from code that has already been written. This is not patently false, but if your sole goal is to write fewer bytes of code you can attack that lots of ways without OOP. You can find repeated program structures and put them in functions, or just use shorter variable names. :)

What real OOP is about is creating entities which saliently capture specific programmer concerns (objects). Thanks to the built-in creation management of objects through constructors and destructors, you can make the lifetime of the object map to the lifetime of the concern. It’s more about “everything in its right place” and sane APIs than it is about reducing lines of code.

So given that, what’s wrong with this picture?

{
Employee worker;
worker.DoSomeStuffAndMaybeInitializeEmployee();
if (worker.IsValid())
   {
   // since worker is valid, it's ok to call GetName()
   printf("The employee is %s\n", worker.GetName());
   worker.DoOtherStuffMaybeUninitializeEmployee();
   }
else
   {
   // calling GetName() will fail, so do something else
   worker.DoOtherStuffDefinitelyInitializeEmployee();
   }
 
// before destructor runs, worker must be uninitialized
if (worker.IsValid())
   worker.Uninitialize();
 
// should be safe to run employee destructor now!
}

In short, this is not real C++. The abstract concept of an employee is getting tangled up with the mechanics of initializing an employee, after the constructor has already run. The programmer is worried about cleaning up the object before the destructor runs, which is terrible since destructors should be safe to run at any moment. Especially if you believe in exception handling!
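For contrast, here is a minimal sketch of the lifetime-as-protocol style, assuming a hypothetical Employee whose only state is a name. The constructor either produces a fully valid object or throws, so no IsValid() mode exists and the destructor is safe at any moment.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical Employee: if an object exists at all, it is valid.
class Employee {
    std::string name_;
public:
    explicit Employee(const std::string& name) : name_(name) {
        if (name_.empty())
            throw std::invalid_argument("an employee needs a name");
    }

    // Always safe to call; there is no "uninitialized" mode to check for.
    const std::string& GetName() const { return name_; }

    // No Uninitialize(): the implicit destructor may run at any time,
    // including during stack unwinding after an exception.
};

// The "employee could not be set up" case is handled where construction
// fails, instead of through validity checks scattered around each call.
inline std::string Describe(const std::string& name) {
    try {
        Employee worker(name);
        return "The employee is " + worker.GetName();
    } catch (const std::invalid_argument&) {
        return "no such employee";
    }
}
```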

There are grievous examples of this in many class libraries. For example, in the Microsoft Foundation Classes (MFC) the CView class tries to inherit much of its functionality from CWnd. (A view is-a window, and hence it “inherits” much of its functionality.) Yet there are copious comments in the source warning you against putting “too much code” in the constructor for a class inheriting from view. Instead, you’re supposed to put it in the OnInitialUpdate() or OnInitialize() method.

Why do these Initialize methods exist? When the view constructor runs, the window it needs should have been already created, so that relevant initialization code can be put in the constructor, however much it takes. It doesn’t do this because MFC is too caught up in the mechanics of inheritance and dismisses the importance of separation of concerns. A view constructor should be the place for initialization code, not an obligatory “post-constructor” function.

Another generally troubling aspect about the example above is that there are “modes” under which certain methods are not safe to call at run-time. This indicates the design is not protecting the programmer from a concern—just giving them a headache! (However, see my article on usage of const to see how modes on objects at compile time can actually be quite helpful.)

In short: don’t lose focus on the idea that objects in a class library are there to reduce the concerns of the programmer. That reduces client lines of code and bugs. If you instead think of how to use C++ to reduce the concerns of the class library author by saving code through inheritance, you’re probably missing the point!

Pseudo-functional programming tricks in C++

Tuesday, March 15th, 2005

Wouldn’t you be upset if you discovered that sqrt(9) was 3 on weekends and 3.7 every other day of the week? I certainly would.

Yet in languages like C++, there is nothing fundamental about the language which keeps a misguided library author from writing things like:

float sqrt(float x)
   {
   if (x == 9)
      {
      if (today() == Saturday || today() == Sunday)
         return 3;
      else
         return 3.7;
      }
   ...
   }

Of course, you’d hope that the authors of the standard math libraries wouldn’t do something so blatantly crazy. But you can’t really protect a module from a dependency you don’t want to arise in C++. So any function in any library can—theoretically—examine any information it wants. That information might live in a global variable, somewhere in the file system, in the video buffer, or even on the Internet.

To address this issue, computer science academics often advocate ditching traditional OOP entirely and moving to functional programming. This lets you contractually define operations which are guaranteed to depend solely on the parameters you give to them. This is great, but it requires a certain “stylized” way of thinking…and it doesn’t trust programmers to “do the right thing”. So functional programming often makes seemingly-simple procedures a hassle to write.

Yet it’s possible to abuse C++ a little, in order to be relatively sure certain methods aren’t depending on certain things. For instance, let’s imagine that you have a Region object type which is going to receive some number of Enter() and Exit() calls, followed by a call to RunAction():

class Region
   {
   ...
   virtual void Enter() = 0;
   virtual void Exit() = 0;
 
   // supply action code here, by overriding "RunAction"
   // please do not make behavior depend on how
   // many times Exit or Enter was called, only
   // depend on what the last call was!
   virtual void RunAction() = 0;
   ...
   };

Notice that you would very much like RunAction()’s effects to NOT depend on how many Enter()s or Exit()s happened. Yet the comments do little to enforce this: if any of Region’s members are modified by the Enter() or Exit() calls, that might introduce a dependency! If you want to discourage the RunAction() method from depending on those modifications to Region, you can introduce a cooperating class:

class Action
   {
   ...
   virtual void Run(bool inside) = 0;
   ...
   };
 
class Region
   {
   ...
   virtual void Enter() = 0;
   virtual void Exit() = 0;
   virtual std::auto_ptr<Action> AllocateAction() const = 0;
   ...
   };

I’ve used a smart pointer here (std::auto_ptr), to indicate a transfer of ownership. The Action that Region::AllocateAction() returns is owned by the caller and can be freed by it. That’s important!

Before any Enter() or Exit() calls are made, the caller asks the Region object to construct an Action object which predetermines the final action. The Action object does not have access to the state which may be accumulated by the Region object, and vice-versa. Yet Region still drives the logic: it was able to give us a class which dictates what will happen when an Action::Run() is finally invoked…but without being able to make that execution conditional on the state it might gather.

On the surface this looks good, but it can be undermined (unintentionally or otherwise) in several ways. For instance, the Region object could poke a this pointer into the Action it returns, thus giving the Action access to its state during the later call to Action::Run(). However, as long as we require (by convention) that Region::AllocateAction() produce an isomorphic object on each invocation, the caller can:

  • Construct an object region1 of type Region
  • Call region1.AllocateAction() to produce an object action of type Action
  • Construct a new object region2 of type Region (same constructor as the first step)
  • Make the Enter() and Exit() calls on region2
  • Optional: make a few spurious Enter() and Exit() calls on region1
  • Destroy region1
  • Call action->Run(inside), where inside indicates whether we are inside region2

This obviously hampers the ability of Region developers to build a coupling between the Region object and the Action object it hands you. It’s not exactly functional programming, and of course a programmer could subvert any hard-coded strategy you might come up with (it’s even better if you do something more random). Yet it should catch most major violations of the model, and alert a programmer who is simply unaware of the dependency that they were doing something wrong.
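The caller-side steps listed above might be sketched as follows. Tally, TallyAction, TallyRegion, and RunDecoupled are hypothetical names, and std::unique_ptr (C++11) is assumed in place of std::auto_ptr, which newer compilers no longer provide.

```cpp
#include <cassert>
#include <memory>

class Action {
public:
    virtual ~Action() {}
    virtual void Run(bool inside) = 0;
};

class Region {
public:
    virtual ~Region() {}
    virtual void Enter() = 0;
    virtual void Exit() = 0;
    virtual std::unique_ptr<Action> AllocateAction() const = 0;
};

// Hypothetical concrete pair, used only to observe what Run() was told.
struct Tally { int insideRuns; int outsideRuns; };

class TallyAction : public Action {
    Tally* tally_;
public:
    explicit TallyAction(Tally* tally) : tally_(tally) {}
    void Run(bool inside) override {
        if (inside) ++tally_->insideRuns; else ++tally_->outsideRuns;
    }
};

class TallyRegion : public Region {
    Tally* tally_;
    int calls_;   // accumulated state the Action should never see
public:
    explicit TallyRegion(Tally* tally) : tally_(tally), calls_(0) {}
    void Enter() override { ++calls_; }
    void Exit() override { ++calls_; }
    std::unique_ptr<Action> AllocateAction() const override {
        return std::unique_ptr<Action>(new TallyAction(tally_));
    }
};

// The action comes from region1, the real Enter()/Exit() traffic goes to
// region2, and region1 is destroyed before the action finally runs.
inline void RunDecoupled(Tally* tally, bool endInside) {
    std::unique_ptr<Region> region1(new TallyRegion(tally));
    std::unique_ptr<Action> action = region1->AllocateAction();

    std::unique_ptr<Region> region2(new TallyRegion(tally));
    bool inside = false;
    region2->Enter();
    inside = true;
    if (!endInside) {
        region2->Exit();
        inside = false;
    }

    region1->Enter();   // optional spurious calls on the decoy
    region1->Exit();
    region1.reset();    // destroy region1

    action->Run(inside);
}
```

An Action that had kept a pointer back to region1 would be left dangling here, which is the kind of violation this protocol is meant to expose.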


Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported