Tuesday, August 20, 2013

The C programmer dilemma: To Macro or to Function?

It is one of the most frequent questions a C programmer faces when coding: should macros be used, and if so when, how and with what priority?

The best answer probably lies in actual usage. Personally, I consider the Linux kernel an aggregation of experience, and one of the most insightful sources for answering questions about coding principles.

It seems appropriate first to outline the types of macros in the C language:

  • Object-like, which provide symbolic naming (aliasing), usually of constants.
  • Function-like, which perform an operation on one or more inputs.

Object-like macros are used to give constants descriptive names, making the code more readable. Compilers also predefine such macros to expose, at compile time, the line number, the file name and even the compiler capabilities (the function name is available in a similar way, although in standard C it is a predefined identifier rather than a macro).
Function-like macro usage ranges from simple wrappers to full function alternatives, including ones that are impossible to implement through conventional functions.
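To make the two kinds concrete, here is a minimal sketch. The names MAX_PACKET_SIZE, ARRAY_SIZE, TRACE and packet_fits are made up for illustration and are not taken from any particular code base.

```c
#include <stdio.h>

/* Object-like macro: a descriptive name for a constant (hypothetical). */
#define MAX_PACKET_SIZE 1500

/* Function-like macro: works on an array of any element type,
 * which a regular function cannot do. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Uses the predefined __FILE__ and __LINE__ macros and the C99
 * predefined identifier __func__. */
#define TRACE(msg) \
    printf("%s:%d (%s): %s\n", __FILE__, __LINE__, __func__, (msg))

int packet_fits(int size)
{
    TRACE("checking packet size");
    return size <= MAX_PACKET_SIZE;
}
```

Note that ARRAY_SIZE only gives the right answer for real arrays, not for pointers.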

The Macro comes last
When trying to define a default rule for prioritizing between macros and functions, I usually reach the following order:
  1. Function.
  2. Inline Function. (be careful)
  3. Macro.


This fits most common usage, for a simple reason: functions are safer. They are checked by the compiler and resolved by the linker, are easier to debug, do not increase the code footprint and are easier to scale.
With modern compiler optimizations and the inline option, this order holds up well even under CPU and memory constraints.
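One way to see why functions are the safer default is to count how many times an argument with a side effect is evaluated. The sketch below uses made-up names (SQUARE_MACRO, square_inline, next_value) and is only an illustration of the point.

```c
/* Naive function-like macro: the argument text is pasted twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: the argument is evaluated exactly once. */
static inline int square_inline(int x)
{
    return x * x;
}

static int calls;                 /* counts calls to next_value() */

static int next_value(void)
{
    return ++calls;
}

/* Returns how many times next_value() ran inside the macro. */
int count_calls_macro(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls;
}

/* Returns how many times next_value() ran for the inline function. */
int count_calls_inline(void)
{
    calls = 0;
    (void)square_inline(next_value());
    return calls;
}
```

The macro version calls next_value() twice, the inline function once, and the compiler is still free to inline the function body.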

Only Macro 
Nevertheless, there are still occasions where macros are very useful, and functions, of any type, cannot be an alternative. Usually, these macros either are generic enough to ignore the input types or accept as input the type on which the work is to be done.

A simple example is the min() macro, which is considered "safe":
#define min(x, y) ({ \
typeof(x) _min1 = (x); \
typeof(y) _min2 = (y); \
(void) (&_min1 == &_min2); \
_min1 < _min2 ? _min1 : _min2; })
It's safe because:
  1. It is usable in expression context. (allowing: if(x) min(a,b); )
  2. Only evaluates its arguments once. (supporting: min(a++,b); )
  3. Requires that both of its arguments are of the same type (i.e. no implicit casts).
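The single-evaluation property can be checked with a small user-space sketch. It assumes a compiler with the GNU extensions (statement expressions and typeof, here spelled __typeof__ so it is also accepted in strict modes); NAIVE_MIN and the two helper functions are hypothetical names added for the comparison.

```c
/* Same shape as the kernel-style min() above. */
#define min(x, y) ({ \
    __typeof__(x) _min1 = (x); \
    __typeof__(y) _min2 = (y); \
    (void) (&_min1 == &_min2); \
    _min1 < _min2 ? _min1 : _min2; })

/* Naive version for comparison: arguments may be evaluated twice. */
#define NAIVE_MIN(x, y) ((x) < (y) ? (x) : (y))

/* Increments *a exactly once. */
int safe_min_inc(int *a, int b)
{
    return min((*a)++, b);
}

/* Increments *a twice when *a is the smaller value. */
int naive_min_inc(int *a, int b)
{
    return NAIVE_MIN((*a)++, b);
}
```

Passing arguments of different types, for example an int and a long, makes the pointer comparison in min() trigger a compiler warning, which is how the type check in item 3 is enforced.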
A non-trivial example is the list_entry() macro, which is frequently used in the Linux kernel to handle (doubly) linked lists. The lists are handled using node anchors embedded in the data structures, which generalizes the list handling. (check here for more details)
In fact, list_entry() is a wrapper for the container_of() macro:
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
Its objective is simple: given a node pointer, the data type in which the node lives and the name of the member that represents the node, return a pointer to the enclosing data structure. Note that the node may be placed anywhere in the structure.
Here is a simple example of how it is used:
struct my_data_s
{
   int               id;
   struct list_head  node;
   char             *name;
};

struct my_data_s test_data[10];
struct my_data_s *tmp_data;
struct list_head *p_node = &test_data[5].node;

tmp_data = container_of(p_node, struct my_data_s, node);
/* tmp_data == &test_data[5] */
Obviously, such macros cannot be converted in any way into a function, and at the same time they provide the valuable ability to generalize usage without losing "safety".
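To try this outside the kernel, the following user-space sketch may help. It assumes GNU extensions, takes offsetof from <stddef.h>, and defines a minimal struct list_head of its own; struct other_s and the two helper functions are hypothetical and exist only to show the same macro serving two unrelated types.

```c
#include <stddef.h>   /* offsetof */

/* Minimal stand-in for the kernel's list node. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) ({ \
    const __typeof__(((type *)0)->member) *__mptr = (ptr); \
    (type *)((char *)__mptr - offsetof(type, member)); })

struct my_data_s {
    int               id;
    struct list_head  node;   /* node in the middle of the struct */
    char             *name;
};

struct other_s {              /* hypothetical second container type */
    struct list_head  node;   /* node as the first member */
    double            value;
};

struct my_data_s *data_from_node(struct list_head *n)
{
    return container_of(n, struct my_data_s, node);
}

struct other_s *other_from_node(struct list_head *n)
{
    return container_of(n, struct other_s, node);
}
```

A single function could not cover both cases, since the container type and the member name are not values that can be passed at run time.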

Creator Macro
Another interesting usage of macros is the static creation & initialization of objects.
One such example, also in the list module, is LIST_HEAD(), which creates a linked list and initializes it. In some cases, the modules provide both static and dynamic (*alloc) object/instance creation, where the macro is used for creating the static ones.
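A user-space sketch of the idea, modeled on the kernel's list helpers, could look as follows. The definitions are reproduced here from memory as an assumption rather than copied from a specific kernel version, and static_list is a made-up name.

```c
/* Minimal stand-in for the kernel's list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Static creation: an empty list points to itself in both directions. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Dynamic counterpart: initialize a node that already exists. */
static inline void INIT_LIST_HEAD(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

static inline int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Created and initialized at compile time, no run-time code needed. */
LIST_HEAD(static_list);
```

The macro form is needed for the static case because the initializer has to refer to the name of the very object being defined.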

Wrapper Macro
Wrappers can actually be implemented as functions; however, they are often so basic that a function call seems wasteful. Other times, the operation needs to be controlled at compile time (and not at run time).
Usage includes mocking of functions in unit tests, expanding an existing function (adding arguments) without touching the callers, switching between debug and production code through compilation flags (see the assert() macro), etc.
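Two of these usages can be sketched briefly. Everything named here (DEBUG_BUILD, DBG_LOG, DEFAULT_TIMEOUT_MS, open_channel, open_channel_ex) is hypothetical and only illustrates the pattern.

```c
#include <stdio.h>

/* DEBUG_BUILD is a hypothetical flag, e.g. passed as -DDEBUG_BUILD. */
#ifdef DEBUG_BUILD
#define DBG_LOG(msg) \
    fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, (msg))
#else
#define DBG_LOG(msg) ((void)0)   /* expands to nothing in production */
#endif

#define DEFAULT_TIMEOUT_MS 500

/* Extended function with a new argument.
 * For the sake of the example it returns the timeout it would use. */
int open_channel_ex(int id, int timeout_ms)
{
    (void)id;
    DBG_LOG("open_channel_ex called");
    return timeout_ms;
}

/* Existing callers keep writing open_channel(id) unchanged. */
#define open_channel(id) open_channel_ex((id), DEFAULT_TIMEOUT_MS)
```

The decision whether logging exists at all is made at compile time, which is the same approach assert() takes with NDEBUG.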



Summary
We should prefer a function, then an inline function if necessary, and a macro when appropriate, in this order.
Justifying the use of each function-like macro is a good start.



(and when such a reason is found, please share it so I can add it to my list :-))

Thursday, August 1, 2013

Why we should rain down builds and deliver without restraint

What I like about software development is the diversity of subjects I have the opportunity to explore. What I like even more is that my opinions and the way I see things constantly change.



Strange as it may seem, in my domain (embedded, C/C++ based, telecom industry) Continuous Integration is not a familiar term or practice.
From my limited experience in the field, practices and tools that support CI for embedded C/C++ programming are rare.

For many (in our domain), the idea of every check-in to a release branch becoming a deliverable build is insane.

Designing and coding for a few weeks, followed by a few weeks of QA tests, is very common. There is a clear separation between the development stage and the validation stage, starting from the physical location of the personnel, through the purpose of each team, to the way its productivity & effectiveness are measured.
Project managers support the separation by bridging gaps and scheduling work between the DEV, QA and other groups. The legacy attempt to plan everything in advance leads PMs to schedule the delivery of builds and their content early on, even though practice teaches us that a team can rarely keep up with the original plan. The practice is repeated and optimized over time by building more and more graphs, spreadsheets and tools to make the estimates and plans more accurate.
By the nature of this planning, the development team will look for the most suitable build delivery rate, taking into account the overhead of aggregating all changes (of all components) into a build, testing each, preparing documentation, etc.

Coming back to the purpose of this post, here is a proposed change from the ground up: embedded or not, SW is SW. If CI works so well in the rest of the industry, embrace it and try it out.

Yes, create builds on every check-in, deliver them by making them available to QA (or support) and open up the borders between developers and testers by working closer together.

Trying to depict some advantages of frequent build delivery, I came up with these few points:

  • Response time between detecting and resolving bugs is shortened considerably:
    • From the tester's perspective, a build may arrive immediately after the programmer checks in the fix. This keeps the tester in the same context (subject, setup, etc.), which should reduce the validation time.
    • From the programmer's perspective, the feedback arrives closer to the actual change, so issues are easier and faster to resolve.
  • As a nice side effect, check-in quality (to release branches) increases due to the simple fact that every check-in is also a delivery. Breaking the build is no longer a private failure that affects only the productivity of fellow programmers; it has a larger, unacceptable effect and must therefore be resolved at the root.
  • Interaction between DEV & QA increases: they communicate more, and testers become more involved in the development process and can influence it.
  • It opens up areas for improving the process at low levels, from automation to programming practices, which shortens the feedback loop and increases quality.


So let's try it out: build without restraint, encourage testers to use the builds and validate the changes.

Obviously, nothing comes for free, and there are points which need to be addressed before proceeding down this path:
  • Version format needs to support a large number of builds.
  • CI tools need to be integrated, allowing automated build creation, making build creation "cheap".
  • Introduce automated Unit Tests and Integration Tests to increase/assure quality and reduce the overhead of creating builds (in relation to tests).

Feedback and other ideas are welcome, especially if you do not agree. :)