Department of Engineering
C++: Visibility and Look-up (Draft)
This article deals with some situations where entities eclipse
other entities with the same name. It starts with a review of some
standard C++ mechanisms then presents some problems that arise
when combining them. It's not for beginners - unless you're serious about
C++ you don't need to worry about meeting these problems or trying to
understand them.
Consider this little program.
int i;
int main() {
int i;
i=9;
}
There are two variables called i - one inside main and one outside.
Which i is set to 9? By default the "nearest" one is, the one
inside main (we'll come back to what is meant by "nearest" later
in this article). The other i isn't visible (it's covered
up by main's i) but it's "in scope" so it can be accessed.
C++'s scope resolution operator (::) is used when specifying
where to look for a variable in such circumstances. Written with nothing in front
of it, :: means the global scope (here, the scope that encloses main). Using ::i
accesses the variable there.
A namespace defines a scope. It's like a context which determines the
meaning of a symbol. Just as the
meaning of "Cambridge" will change depending on whether you're in a UK or
US context, so the variable accessed in a C++ program by the symbol "i" will
depend on the context, as we've seen above.
Some simple languages only have one namespace - all function names, variable names, etc belong to the same context. Some other languages have several independent namespaces (one for variable names, one for function names, etc) making it possible to have both a variable and function with the same name, but the number and role of these namespaces are fixed.
C++ has some fixed namespaces, but it also has named namespaces and lets users create new namespaces. It also offers control over which of these named namespaces will be used when the meaning of a symbol is required.
In ANSI C++ the standard library facilities (like cout, string, etc) are kept inside the namespace called std, which by default isn't consulted.
Namespaces are created, and entities put into them, with the namespace keyword.
E.g.
namespace test {
int i;
};
creates a namespace called test (if one hasn't already been created)
and puts i into it.
Then test::i (the same notation that you'd use were test a class) will access the i variable. The directive using namespace test makes everything inside test available so that
test:: isn't necessary. Let's see this in action.
namespace test {
int i;
};
int i;
int main() {
i=9;
}
In this program main can only see one i; the other is hidden inside the test namespace. The hidden one can still
be accessed using test::i. What about the following though?
namespace test {
int i;
};
using namespace test; //this line's been added
int i;
int main() {
i=9;
}
Here both variables called i are visible from main. In fact they clash, so
this program won't compile. My compiler says: The declarations "int i"
and "int test::i" are both visible and neither is preferred under the name lookup
rules.
Functions can be in a class or free standing. Whenever a function call is
processed there may be several available functions in scope with the same
name. C++ performs a look-up using a well-defined strategy in order
to decide which function to call. Sometimes it can't decide which function
is best to call, in which case the compiler complains about
an "ambiguous call". Usually there are no problems - the programmer and
compiler agree on the best option.
In the following, for example,
there's a free-standing fun(float f) as well as one in the class.
The fun(f) call in main calls the free-standing one. The call
from inside the class calls the "nearer" one inside the class.
void fun(float f){};
class classy {
public:
void fun(float f){};
void fun2(float f){fun(f);};
};
int main() {
float f;
fun(f);
classy c;
c.fun2(f);
}
Functions add a complication because it's possible
to have many functions with the same name all visible without clashing
as long as they take different arguments. In the following example
fun is overloaded - which isn't a problem!
void fun(int i) {};
void fun(int i, int j) {};
int main () {
fun(3);
fun(5,7);
}
The following's perhaps a little trickier.
void fun(float f) {};
int main () {
int i=3;
fun(i);
}
There's no function called fun that takes an int, so the int is implicitly converted to float and fun(float f) is called without complaint. In the next example fun(int) is supplied, so
this will be the preferred candidate.
void fun(float f) {};
void fun(int f) {}; // added line
int main () {
int i=3;
fun(i);
}
None of that should be too disturbing, but what about the following?
Which f function is called? The first might look like the closest
match, but it's the second that's called. A string literal converts to bool using standard conversions built into C/C++ (array to pointer, then pointer to bool), whereas converting it to a string needs a user-defined conversion (string's constructor), and matches using standard conversions
take precedence over user-defined ones.
#include <iostream>
#include <string>
using namespace std;
void f(string a, string b, bool c = false) {
cout << "called 3 arg function" << endl;
};
void f(string a, bool c = false) {
cout << "called 2 arg function" << endl;
};
int main()
{
f("one", "two");
}
Often you'll need to add extra functionality to an existing class. C++ provides a mechanism, inheritance, to build new classes from old ones.
class Base {
public:
int value1;
};
class More : public Base {
public:
int value2;
};
int main() {
Base b;
b.value1=7;
More m;
m.value1=7;
m.value2=9;
}
Here More inherits the members of Base so m
has 2 members - value1 and value2. Members can be functions or variables. The following, which uses functions where the previous
example used variables, works ok.
class Base {
public:
void fun1(){};
};
class More : public Base {
public:
void fun2(){};
};
int main() {
Base b;
b.fun1();
More m;
m.fun1();
m.fun2();
}
Now we come to our first "interesting program". Suppose we give both functions the same
name but different arguments. What happens?
class Base {
public:
void fun1(){};
};
class More : public Base {
public:
void fun1(int i){};
};
int main() {
Base b;
b.fun1();
More m;
m.fun1();
m.fun1(5);
}
b.fun1() poses no problem. One might expect m.fun1()
to call the Base's function and m.fun1(5) to call
More's function (i.e. expect fun1 to be overloaded).
In fact
the code doesn't compile - Base's void fun1() is hidden by More's void fun1(int), because look-up stops at the first scope in which the name is found. With an extra line it will compile.
class Base {
public:
void fun1(){};
};
class More : public Base {
public:
using Base::fun1; // added line
void fun1(int i){};
};
int main() {
Base b;
b.fun1();
More m;
m.fun1();
m.fun1(5);
}
And here's another surprising situation. The following compiles, but why?
namespace test {
class T {};
void f(T){};
};
test::T parm;
int main() {
f(parm); // OK: calls test::f
}
Here we have a namespace called test inside which there's a
class T and a function that takes one argument of type T.
Outside the namespace a variable called parm is created. Note
that the test:: is needed to get hold of the T within
this namespace. In main a function f is called. Even though
there's no test:: before the function name, and no previous using namespace test line, the program compiles.
This is a situation where
Koenig lookup (also called Argument-Dependent name Lookup - ADL) is used. If you supply a function argument that isn't a built-in type (here parm, of type test::T), then to find the function name the compiler is required to look in the namespace (in this case test) that contains the argument's type as well as in the usual places.
Ordinary name look-up searches for unqualified names in the nearest enclosing scope where the name is used, and if the name is not found there, the look-up proceeds through successively enclosing scopes until the name is found. Even if the name found is not appropriate
for the given use, the look-up search proceeds no further through the hierarchies. At this point ADL finishes the job.
This explains why the example in the previous section failed whereas the one in
this section succeeded, but the look-up mechanism seems to be defeating one of the purposes of namespaces -
the ability to hide entities. However, there's a case for saying that once
T is brought out into the open, then associated routines should
become visible too. There are also pragmatic and safety reasons why
ADL is used.
- Here's a simple program
#include <iostream>
#include <string>
int main() {
std::string hello = "Hello, world";
std::cout << hello;
}
This is analogous to the previous program: std::string is like the
test::T of the earlier example. The operator<< that prints a string is a free function in namespace std
that the compiler can only find using ADL (it can't be a member function of string because it requires a stream as the left-hand argument).
Without ADL the final
line would be awkward to express.
- Here's another simple fragment
char x;
void f() {
int x;
x = 'a';
}
C/C++ has always set the function's x in this situation, although
the other x is a closer match type-wise. ADL is consistent with this traditional
behaviour.
Here's a situation involving classes. In this fragment, the g
function calls the class's f routine.
class X {
int f(int);
int g() { return f('a'); }
};
But suppose that during program development a global function
f(char) were added - what f function should
g call then? It would be an unpleasant shock if the global function
were called - you don't want the internals of classes to be
quite so vulnerable to external changes.
A final note from Victor Bazarov on comp.lang.c++ -
ADL applies only to function names, not variables. The
only other thing that has arguments in C++ is templates. But ADL doesn't
apply to them. In this example
namespace test {
enum foo { f };
template<foo f> class bar {};
}
int main() {
bar<test::f> barf; // error: bar isn't looked up in test
}
main's bar isn't going to be looked up in
test even though
its argument is fully qualified and found in the test namespace.
Most of the material above is derived from the comp.lang.c++ newsgroup
and the following sources.