In my last blog post I've ranted about problems with implementing intrusive containers in C++.
The main problem is that to allow an object to be included into an intrusive container you have to modify the object itself, for example, add 'prev' and 'next' member variables to it. That in turn breaks the encapsulation principle. In rigorous object-oriented design the item is owned by the container, and thus the container should be aware of the item but not vice versa.
There have been a lot of discussion about the blog post both on this site and elsewhere but, unfortunately, participants focused on unrelated issues like whether list is an efficient data structure or whether objects as such rather than pointers should be stored in the non-intrusive list. The encapsulation issue went almost unnoticed and nobody have proposed any solution for it.
While I believe the encapsulation issue is a problem of design of C++ language and thus cannot be solved in C++, there seems to be a way to mitigate it. You cannot avoid adding 'prev' and 'next' pointers to the contained item, however you can prohibit the item to access these members and you can make them visible only to the container. To do that you can use a strange technique that I haven't seen being used so far, so I am taking liberty to call it "enclave pattern".
According to Wikipedia "An enclave is a territory entirely surrounded by another territory."
In the picture above, C is an eclave of A inside the teritory of B. Well-know real-world enclaves are, for example, Kaliningrad or Nagorno-Karabakh.
In the programming context we can create an enclave C of class A inside class B. The idea is that while instance of C is embedded inside of instance of B, its members are not accessible by B, only by A:
class A
{
public:
class C
{
friend class ::A;
private:
int i;
};
void fx ();
};
class B
{
public:
A::C c;
void fx ();
};
The unusual thing about the code snippet above is that class C is entirely private. It has no public members nor methods. In theory, it should be unusable. However, we are using 'friend' keyword to grant class A access to it.
Now, let's test the encapsulation:
void A::fx ()
{
B b;
// Following assignment compiles OK.
b.c.i = 0;
}
void B::fx ()
{
// Following assignment fails to compile:
// error: ‘int A::C::i’ is private
c.i = 0;
}
As can be seen, enclave C exists inide of B, however, it can be accessed only from the methods of A.
And here's an example of implementing an intrusive list using the enclave pattern:
class list
{
public:
class helper
{
friend class ::list;
private:
helper *prev;
helper *next;
};
void push_back (node *item)
{
mtx.lock ();
item->next = 0;
item->prev = last;
last = item;
mtx.unlock ();
}
private:
mutex mtx;
helper *first;
helper *last;
};
// This class can be contained in the list.
class person
{
public:
list::helper h;
int age;
int weight;
};
EDIT: The following example was added to the article post facto.
I've deliberately made the list implementation thread-safe to make the value of enclave pattern more obvious. The 'list' object contains a mutex that guards all the data fields belonging to the list. That is 'first' and 'last' members of the 'list' itself as well as 'prev' and 'next' fields that are the enclave of 'list' class inside 'person' class.
Now, imagine that implementer of 'person' class would accidentally mess with the 'next' pointer. As 'person' class exists outside of the list's critical section, it would lead to race conditions and undefined behaviour:
class person
{
public:
list::helper h;
int age;
int weight;
void be_evil ()
{
h.next = 0;
}
};
Fortunately, the 'enclave' pattern saves the day here. When trying to compile the above implementation of 'person' the compiler will report an error:
error: ‘list::node *list::node::next’ is private
It is entirely possible that this pattern was already described elsewhere and I believe it is new only thanks to sheer ignorance. If so, let me know and I'll add the appropriate citation to this article.
Martin Sústrik, September 11th, 2012
Nested objects or builder pattern - thats a regular name fo rthis, and use extensively in OOP design, when you dont want to expose object but want still gives way to query it.
In case you want also to modify (by aplying or modifying behaviour) in composoable and safe way you use Aplicative functors pattern (monoidal preferably).
Wikipedia's description of builder pattern seems to be a completely different thing:
https://en.wikipedia.org/wiki/Builder_pattern#C.2B.2B
well , depends how you apply it, and common ways to apply this pattern depends on type system. For example Java vs JS build pattern application.
You can apply it in a way of creating nested object or in a way of creating new aggregated instance.
For languages with monomorphic type system its common to apply builder pattern to create aggregated instances.
Otherwise common pattern is to create (using builder pattern or just simply return nested object) on request a contract for some underlying (polymorphic) data structure, without breaking encapsulation of that structure. This way you can return for specific request (in case of builder pattern) a contract - immutable, with possibly wired events to underlying storage and AP pattern to morph it functionality (and safely compose it), for example to dynamicly extend contract for new query type (this can also be hardwired into builer as well but with losses).
Really its all common topics in any OOP CS.
Hm. I still don't see where a data member of one class is exclusively owned by another class, which is what the blog post was about.
msdn.microsoft.com /en-US/library/8b1xzat2(v=vs.100)
Simple iterator pattern.
Based on nested class pattern. msdn.microsoft.com /en-US/library/ms173120(v=vs.110)
Nested class used extensively in java as poor man closure.
Gives you access to parent class members without exposing parent class innards to the world.
Builder pattern to return nested class or aggregate as a contract appropriately constructed with specific semantics.
If you want exclusivity(?) i assume you mean here immutability of items - return nested class or ask builder to make you a contract with Enumerable(iterator) to underlying storage, which do new instance(copy) of each returned item.
In case you refer to exclusivity in a sense of immutability of underlying storage - i.e. after generating contract not to observe changes in underlying storage (new item added for example due to laziness of Iterator will be still observed) - copy underlying storage and evaluate it eagerly.
In case you refer to exclusivity in access semantics - that’s exactly why we want to have contract.
The only purpose of builder pattern is to generate the contract with exact semantics caller requested. (access semantics, items semantics (mutability), lazy, etc)
I still don't follow.
According to wikipedia: "The intent of the Builder design pattern is to separate the construction of a complex object from its representation. By doing so, the same construction process can create different representations."
However, the construction is not being discussed here.
What's discussed is a data member of class A, which cannot be accessed by methods of class A, only by methods of unrelated class B.
ok, no builder, though i fail to see how static and hardcoded direct access any usefull.
Take a look then on nested class and closure.
The article explicitly mentions the motivation: Implementing intrusive containers without breaking encapsulation.
Nested class is used in the example, but it's used only to make it nicer. Everything would work in the same way even in C was declared as a top-level class.
Closures deal with storing a snapshot of the environment, they don't seem to have much in common with the topic discussed here.
It seems that what you propose is a way to implement the listable functionality as a kind of mixin (or trait). This is usually done in C++ as a template class which inherits from the template parameter. If I remember correclty, user Helio has shown this as an example in the discussion of your last blog.
Yup. Helio's solution is similar to the one above in that the contained class can't access the prev & next members. However, there are two problems with it:
1. How to embed the object into multiple lists?
2. How to ensure that prev & next members are accessible only by list class and not by anyone holding pointer to the object.
Problem 1 is still present in your solution, since you have to modify the class ‘B` every time you want to add it to another list. Not that it were something bad. But you’ve mentioned this yourself in your previous blog post as an argument against C++'s way, and directed it upon other people's attempts at solution, so you should respect your own standards here too ;-) According to this standard, this solution is still not the ideal. And making it ideal by enforcing 0 modifications to the ‘B` class, or even no recompilation required, could be difficult to achieve, if not impossible. Because, as you’ve said yourselves, you need to put those ‘prev` and `next` somewhere this way or another. And every time it would have to be a part of `B`’s objects' layout in memory, whenever this layout will change, `B` and its users would have to recompile at the very least, I guess.
Yes, it's a pretty lame solution. I've posted the article only because I've never seen this pattern in the wild. I thought it would make for interesting reading and possibly make people think of limitations inherent in existing languages.
Hi,
my main point was that the solution you propose is another way to implement a mixin (or a trait), so I don't think it is a new pattern.
1. I don't see a difference here to your solution. With template mixin, I need one mixin class for every list, while your solution needs a new helper class for every list. Both don't scale well, but the mixin have the advantage that I don't have to modify the person class every time. This is nice when it comes to testability and single responsibility.
For both solutions, you need a different list type for every list the object will be contained in.
2. Same solution as in your proposal: Make the members private and declare the list as friend of the mixin:
class Personlist;
template <class contents, typename List> class node: public contents
{
friend List;
node<contents, List> *prev;
node<contents, List> *next;
};
class person
{
int age;
int weight;
};
typedef node<person, Personlist> person_node;
class Personlist
{
public:
void add(node<person, Personlist>& n)
{
n.prev = &last;
n.next = nullptr;
last.next = &n;
last = n;
}
private:
node<person, Personlist> first;
node<person, Personlist> last;
};
I am just a C/C++ programmer and I have no experience with mixins, so please bear with me…
Looking at the wikipedia article about mixins, it seems that mixins are bundles of functionality that you can add to your class.
However, in the example in the article there's no functionality involved. Making C a member of B has zero impact on the functionality of B. It adds no functionality that can be used from within B and no additional functionality for the owners of objects of class B.
What really happens is that memory for couple of member variables of unrelated class A is allocated with instance of B instead of with instance of A.
The 'enclave' pattern redefines how boundaries of objects are perceived. In plain C++ class is just a glorified struct. It's a continuous piece of memory with some functions attached.
That, however, doesn't fit well (at least mine) mental concept of objects. For example, I want the object to be "atomic" in the sense that if I enclose all of its members into a critical section, I won't get any data races and undefined behavior.
So, in the last example in the article. 'list' class contains members 'prev', 'next', 'first' and 'last'. The latter two are standard members, both in terms of visibility and memory location. The former two have the same visibility as normal members, however, memory-wise they happen to be located elsewhere (alongside the instances of class 'person').
I am not sure whether this kind of thing quialifies as a mixin. Wikipedia doesn't seem to suggest so.
Well, it adds the functionality to be part of a list to B, so in that way it is a mixin. Of course, you chose to implement the functionality in a way which is kind of non-oop because it does not talk about behavior or actions but about some members of a "glorified struct". In a real OOP design, I am not interested in data members (and surely not in data layout), but in the interface and behavior of objects. I think you should focus less on data layout and data modelling but on interfaces and behavior of things if you want to have an object-oriented design.
By the way, I don't understand "That, however, doesn't fit well (at least mine) mental concept of objects. For example, I want the object to be "atomic" in the sense that if I enclose all of its members into a critical section, I won't get any data races and undefined behavior." What does it mean? Where do you enclose members into a critical section?
"In a real OOP design, I am not interested in data members (and surely not in data layout)"
The above is exactly the point of the article. The OO in a certian way isolates the developer from actual memory layout and thus can lead to sub-optimal solutions. Other way round, if you exercise full control over memory layout it leads to violations of OO paradigm.
"Where do you enclose members into a critical section?"
I've added an example into the article.
"The OO in a certian way isolates the developer from actual memory layout and thus can lead to sub-optimal solutions"
IMHO getting away from the memory layout and working with abstractions is a benefit I am happily paying, if there is any. The mixin solution leads to a similar memory layout, though, and is more OO-style (with all OO-benefits), once one adds member functions instead of plain member variables to the mixin.
And the conception "low level == efficient" is not necessarily true, as Stroustrup points out in his presentation (google for Day 1 Keynote - Bjarne Stroustrup: C++11 Style) when comparing std::sort with qsort. Both implement a quicksort, but std::sort has more information for optimization because it knows the types and can inline function calls. I would only resort to low-level optimizations when profiling has shown that it is a hotspot and measure afterwards.
"if you exercise full control over memory layout it leads to violations of OO paradigm."
I would generalize this to "if you exercise full control over memory layout it leads to violations of many established design principles of any design methodology" because basing your design on implementations instead of abstractions is never a good idea, being it OO or not.
Yes. My claim was that using C++ for system-level programming (ZeroMQ) was not a very good idea. Once you get higher in the stack, C++ becomes more and more preferable. For rapid development where 100% reliability and utmost performance isn't that important it's almost perfect.
This looks interesting! Can you extend the example to show some list operations (for example how to add an element to the list)?