Previous: Does GPL hurt free software?
Next: Enclave Pattern
In my previous blog post I've discussed how the need for rigorous error handling in low-level infrastructure software prevents the usage of many fundamental C++ features (exceptions, constructors, destructors). The conclusion was that with such a severely limited feature set, writing the program in C yields shorter and more readable codebase. A side effect is eliminating the dependency on C++ runtime library, which shouldn't be easily dismissed, especially in embedded environments.
In this blog post I would like to explore the problem from a different point of view. Namely: Are there any performance implications in using C++ vs. C ?
In theory, the two languages should be equivalent. Object-orientedness is after all just a syntactic sugar on the top of procedural language, making the code more understandable to the human brain which seems to have evolved natural ability to deal with objects at the expense of dealing with entities such as flows, relations etc.
While the above is true — every C++ program can be automatically translated into an equivalent C program — the very ideology of the object-orientedness makes developers think in specific ways and design their algorithms and data structures accordingly. That in turn can have performance implications.
Let's compare how a C++ programmer would implement a list of objects:
EDIT: Assume that the object to be contained in the list is non-Assignable as is the case with any non-trivial objects, for example those holding large memory buffers, file descriptors, threads etc. If the object is Assignable simple std::list<person> would do and there is no problem.
class person
{
int age;
int weight;
};
std::list <person*> people;
C programmer would rather solve the same problem in the following way:
struct person
{
struct person *prev;
struct person *next;
int age;
int weight;
};
struct
{
struct person *first;
struct person *last;
} people;
Now, let's compare how this solutions look like in the memory:
First thing to note is that the C++ solution allocates twice the number of memory chunks compared to the C solution. For each element of the list there's a little helper object created. When there are lots of containers in the program the amount of these helper objects proliferates.
For example, in ZeroMQ creating and connecting a socket results in dozens of memory allocations. In the C rewrite of the library I am working on at the moment, creating a socket requires one memory allocation and connecting it results in yet one more allocation.
Obviously, the number of allocations has performance consequences. The time spent allocating the memory chunks may be irrelevant — in ZeroMQ it is out of critical path anyway — however, the amount of memory used and the memory fragmentation that ensues is pretty important. It has direct impact on how CPU caches are filled and, consequently, on cache miss rate. Recalling that access to physical memory is by far the slowest operation in modern computers may give you an idea what the performance impact would be.
That's not where it ends though.
The solution chosen has direct impact on the complexity of the algorithms. In C++ solution erasing an object from the list has O(n) complexity:
void erase_person (person *ptr)
{
for (std::list <person*>::iterator it = people.begin ();
it != people.end (); ++it)
if (*it == ptr)
people.erase (it);
}
In the C solution, erasure of the object can be done in constant time (simplified version):
void erase_person (struct person *ptr)
{
ptr->next->prev = ptr->prev;
ptr->prev->next = ptr->next;
ptr->next = NULL;
ptr->prev = NULL;
}
Is this lack of efficiency just an artefact of std::list<> implementation or is it inherent to the object-oriented paradigm as such?
Let's have a closer look at the problem.
The real reason why any C++ programmer won't design the list in the C way is that the design breaks the encapsulation principle: The implementer of the "person" class has to know that its instances will be eventually stored in "people" list. Moreover, if a 3rd party chooses to store it in a different list, the implementation of the person would have to be change. This is an anti-pattern that OO programmers learned to avoid.
However, if we can't place the "prev" and "next" fields into "person" class, we have to put them elsewhere. Thus, there's no other option but to allocate a helper object, the way std::list<> does.
Moreover, while helper object contains pointer to the "person" object, "person" object cannot contain pointer to the helper object. If it did, it would break the encapsulation — "person" would have to be aware of the container(s) it resides in. Consequently, we can convert pointer to the helper object ("iterator") to the pointer to "person", but not vice versa. That's the underlying reason why erasing an element from std::list<> requires full scan of the list, in other words, why it is O(n)-complex.
In short, if we follow the object-oriented paradigm we can't implement a list with all operations being O(1)-complex. To do that we have to break the encapsulation principle.
EDIT: A lot of people point out that iterator should be used instead of pointer. However, imagine the object is contained in 10 different lists. You would have to pass structure containing 10 iterators around instead of the pointer. Morever, it doesn't solve the encapsulation problem, just moves it elsewhere. Instead of modifying "person" object every time you would want to add it to a new type of container you would have to modify the iterator tuple structure.
That could be the conclusion of this blog post. However, the topic is interesting enough to ask one more question: Is the deficiency we are seeing inherent to object-oriented design as such or is it specific to C++ language? Can we imagine an object-oriented language that would allow for list with all associated operations being O(1) complex?
To answer the question we have to understand the underlying problem. And the problem turns out to be the very definition of term "object". In C++ "class" is just a synonym for C "struct". Both keywords can be used almost interchangeably. The implication is that "object" is a collection of data located in a continuous chunk of memory.
That is an automatic assumption that C++ programmers don't even start to question. However, let's try to examine the "object" from a different point of view.
Let's say that object is a collection of logically related data that has to be guarded by the same critical section to ensure its consistency in multi-threaded program. This definition radically changes our understanding of the architecture. The picture below shows the C person/people program (as listed above) and marks the data fields that should be guarded by a list-level critical section (yellow) vs. data fields to be guarded by element-level critical section (green):
For a person with object-oriented world view it's a pretty exotic picture. The "people" object is composed not only of the fields in "people" struct itself, but contains also some of the fields ("prev" and "next") in the "person" struct.
However unexpected, the decomposition makes perfect technical sense:
- Guarding all the yellow fields by a list-level critical section ensures consistency of the list. On the other hand, the critical section doesn't guard the green fields ("age" and "weight") thus allowing personal data to be changed without locking the entire list.
- The yellow fields should be accessed only by methods of "people" class even though memory-layout-wise they belong to the "person" struct.
- If the programming language would allow to declare yellow fields within "people" class, the encapsulation would not be broken. In other words, adding "person" to another list won't require to change the definition of "person" class.
Finally, let's do a thought experiment and extend C++ to allow this kind of architecture. Note that goal here is not to offer a perfectly designed language extension, rather to show that the thing is at all possible.
That being said, let me introduce "private in X" construct. It can be used within class definition. The data members following "private in X" would physically become part of structure X, however, they would be accessible only from the class being defined:
class person
{
public:
int age;
int wieght;
};
class people
{
private:
person *first;
person *last;
private in person:
person *prev;
person *next;
};
In conclusion, if ZeroMQ was written in C it would require less memory allocations, there would be less memory fragmentation and some of the algorithms would have O(1) complexity instead of O(n) or O(log n).
The problem behind the inefficiency is not in ZeroMQ code, nor is it an inherent flaw of object-oriented methodology. Rather it is a deficiency in the design of C++ language. And to be fair to C++, the same problem exists in most — if not all — object-oriented languages.
EDIT: Patrick Wyatt have written a more extensive blog post on intrusive lists. You can find it here.
Martin Sústrik, Aug 30th, 2012
Previous: Does GPL hurt free software?
Next: Enclave Pattern
This isn't so much a design flaw in C++ as STL. Using templates to embed the data type in the nodes (creating an intrusive style list) removes the problem of allocation count and allows the address of the data to be turned into a node address to fix up pointers without any fancy iterator stuff. e.g. template < class DataType > class ListNode { ListNode* next; ListNode* prev; DataType data; /* … */ };
No it's not. It's an STL compromise, where the simplest possible case becomes slightly more complex and inefficient in C++ versus a custom-coded C solution. And for a linked-list this works.
For a non-trivial datastructure, like any kind of binary tree, a map, or even a simple vector, I would argue that C++ gains the advantage.
Plus look at maintainability afterwards. Modifying List<Person*> into Vector<Person*> in C++ … that's 5 minutes work, tops. In C you'll be coding all afternoon to make this change unless you're extremely practiced, and you still won't get it in 5 minutes. Making a red-black tree out of it in C is very likely to contain at least one hard-to-find bug, whilst for C++ you'll be done in minutes.
Macros can solve this, but that's like smearing yourself with honey and diving butt-naked in an ant's nest.
I just realized that the "standard" linked list implementation in C : glib's double linked list (google for the docs, I can't post links) … does exactly the same thing as the C++ implementation, losing type safety. Glib uses a helper object with a void* pointer in there.
+1
GLib basically just craps itself when it runs of memory, making it laughable from the perspective of resilient code. Their attitude is:
"If any call to allocate memory fails, the application is terminated. This also means that there is no need to check if the call succeeded."
"no need" - OMG. So, every single program using GLib can fail at any time, either through its own stupidity or any *other* program's attempt to grab too much RAM. The irony is that the other program may not be using GLib, and could easily survive and go on to use the memory freed up by the GLib program imploding. Which offers an easy to to get more RAM - grab all that's left and wait for a bit voila! More memory.
GLib's memory exhaustion handling is a joke.
There's nothing else they can do. Modern OSes allow for memory overcommitment anyway, which means that allocation may succeed even if there isn't sufficient memory available.
Even if that wasn't the case, what would you do if out-of-memory condition happened? Any attempt to recover may run out of memory itself. For example, years ago I've seen closing a file fail in low-memory conditions.
Additionally, when you are running out of memory, OOM killer is probably already in progress and it may very well choose your application to be killed.
In short, memory allocation errors are the primary candidates for "fail fast" strategy.
This is a practical work-around, not a theoretical solution, but FWIW, you could allocate a "spare" memory pool you can borrow from when memory exhaustion becomes a problem.
It will not protect against all conditions, especially not a memory leak in another running process, but it could keep things going long enough to print dire warnings to the console, stderr, etc, and optimistically, allow your app to survive long enough to realloc() to restore the size of the "spare" memory pool.
What it will do is give you something useful to do to recover from malloc() when it returns a null - namely realloc() "spare" to something smaller to free up the memory you need. Hopefully.
To cut reasoning short, if you managed in your C++ part of the code achieve O(N) where O(1) is expected, you got something badly wrong. Correct your design, and for free you will get rid of the nonsense "not-really-C++" code above.
As a writer of low-level C++ code which often avoids std:: containers in places where it matters, I think the problem is somewhat correctable. Boost (and I don't count my self as a big fan) intrusive attempts to solve this problem by allowing some fields into your structures. My attempts to do this (yes, involving templates) are often less sophisticated than boost intrusive, but perhaps a little closer to the metal and simpler.
Often complaints about a language really boil down to complaints about the commonly used libraries in that language. I think its fair to say that std::list is a mainstream choice for solving the problem. But if you really care about counting your allocations and using thread aware allocators, memory pools, and optimization — looking beyond the mainstream seems necessary. In my opinion C++ is not the problem, its the libraries. If the libraries for language are awful, that is a great reason to not use the language (since we don't have time to contribute libraries which fix the problem — right?).
Quite honestly, std::list should be avoided in most code, as other containers will usually outperform it nowadays in all aspects, and even when it doesn't (like when coding a networking library, or applications requiring heavy I/O), then you'd be better off using your own linked list.
Conclusion… never use std::list. But to go as backwards as to use C… I think it's taking it a bit far.
The problem is not having used C++ instead of C, but having used some containers when they weren't appropriate. The main difference between your two approaches here are intrusive vs non intrusive containers, and that doesn't depend on the language.
In your C case, you built an intrusive list. If you build it like that, you'll have to reimplement a list every time you want to make a list of something other than a person.
In your C++ example you used a non-intrusive list. They have their advantages (if you want to use the same pointers to people in multiple lists, it's easier to do so in a non-intrusive list.
You could have used an intrusive list in C++ (see boost for an implementation if you do not want to roll your own), and with the help of templates you could have an intrusive implementation that wouldn't have to be rewritten for each listable class. So using C++ would have been an actual advantage, while maintaining the performance/memory usage of your C example.
The mere fact that std::list exists makes it easier for one to use it for all cases, and it leads to possible wrong design decisions such as the one you pointed in the article. They are appropriate for some use cases, and inappropriate for others. So I wouldn't blame it on C++, not in this case.
my crystal ball says : in 99% of the cases using std::list for making arguments about C++ vs C is trolling or ignorance.
llvm dot org/docs/ProgrammersManual.html#dss_list
But as a C++ dev I can give you examples of C++ sucking compared to c:
vector<vector<» :…
Fixed in C++11.
The solution to the first problem is to use a list of Persons, not Person pointers (i.e. std::list<Person>). Then the template-generated linked list node will look something like this:
No double memory allocation. In fact, it's almost exactly like the "private in X" construct that you're proposing.
Then, to refer to list elements, use a std::list<Person>::iterator instead of a Person*. You can pass this to the list erase function for constant-time erasure. If you think about it, an iterator is equivalent to a pointer (* and -> get you a Person), but it also lets the implementation access the linked list node.
Thanks. I've already edited the text to make it clear that "person" is non-assignable.
As for using iterators instead of pointers, imagine the object is contained in two lists. You would have to pass pair of iterators around. If it's contained in 10 lists, you would have to pass structure containing 10 iterators around. And you are back to encapsulation problem: For every new list you would have to adjust the "iterator tuple" object.
That argument doesn't hold.
If a person were in 10 lists then in your c-implementation you would need 10 pairs of prev and next… That would be an even worse design choice for so many reasons, I think!
Yep. That's the way it is done in Linux kernel for example.
Yes, I think it was definitely a defect in the STL that list elements had to be assignable. They could have had a version of push_back() that took no arguments and instead default constructed a new element (then you could use back() to get to it) but they didn't.
C++11 fixed this by adding emplace_back(), which also uses new C++11 features to let it take any number of arguments which are forwarded to the element's constructor. Given how new C++11 is, I can understand why this might not be an option for most people.
Regarding the iterators, I don't think the C version would have better encapsulation - the encapsulation problem is just elsewhere. I suppose it's a matter of taste where you'd rather have that problem.
The "private in X" idea could solve the encapsulation problem, I suppose, but only if you made a distinct type for each of the 10 lists which a person could be part of - otherwise the compiler wouldn't know to add 10 prev/next pointers.
And that's the cause in most cases, isn't it? Having same object in the same list twice is pretty rare.
I'm talking types, not instances of a type. Having the same object twice in the same list instance is rare, but if you have the same object in 10 different list instances, what would the type of each list instance be? You probably want it to be people in every case, but then the compiler would add only one next/prev pointer to the person type, which isn't enough. There's no way for the compiler to know that at runtime the person will be stored on 10 different lists.
I'm not sure there's a good way to store an element on multiple lists without breaking encapsulation somewhere.
Ah. I see. What we are representing then is an relation between two types of entities. Some kind of 2D bitmap or similar would be more appropriate then rather then a set of lists.
Did you try std::list<person> ?
This is exactly the point of the Boost "Intrusive" containers, which are both delicious and exceedingly simple to use. Check them out!
Yes. They break the encapsulation principle as well. My point was that in C++ you can't have an "intrusive" container and proper encapsulation at the same time.
What you're talking about is containment, and it's entirely possible to implement containment without breaking encapsulation, but your code looks more like composition.
You and other parties don't have to know about each others' implementations if you agree that containment might be important and agree on a containment protocol.
Read the GoF! ;)
If you believe it's possible, please post a short code snippet of person & people classes that doesn't break the encapsulation.
Listen, if you're going to write C++ in the manner one would write C, you're going to run into these problems. C and C++, though they share similar syntax and semantics, are fundamentally different languages. You appear to be writing C++ as one would write C, and this appears to have caused you an unnecessary amount of grief.
Simply put, a sane C++ programmer would simply use a `list<Person>` instead of a `list<Person*>`. This yields a memory layout that is almost exactly equivalent to the C version. Dereferencing an iterator compiles down to an add instruction.
Yeah, I can do it incorrectly in C as well.
In C++, you should iterate through lists using the appropriate const-ness iterator, e.g.
for (ListT::const_iterator i = list.begin() ; i != list.end() ; ++i) {
…
}
You can then use ‘list.remove(i)` to remove a specific element in constant time. This is equivalent to your C code. You shouldn’t be wrapping it up into a function which discards all context (and if you must remove elements by-value, at least use std::list<>::remove).
Yes, this means you cannot wrap your C++ into C-like interfaces and expect good performance. You can't wrap your C into Python-like interfaces and expect good performance either.
To conclude, ZeroMQ shouldn't have been written in C++, not because of the performance characteristics of the language, but because the author refuses to write C++ in a performant manner. If you're going to write in C, by all means, actually use C.
wow, nice comment :)
I, being a developer with java background, was scratching my head thinking "c++ definitely has a way to remove an element in constant time". Thanks for clearing that up.
The complexity of std::list<>::remove is not constant, but linear in container size (comparisons).
How come?
Linked lists almost always suck compared to vectors because the way they lay out memory causes a lot of cache misses and because iterating through them is so expensive.
http://www.youtube.com/watch?v=YQs6IC-vgmo
Yes. But vectors are pretty bad a resizing. It's a trade-off.
It sounds like you're pretty much against C++ regardless of any actual corrections to misunderstandings you may have about it (C++11 fixes quite a few defects). I agree that it's a complicated language and it's easier to shoot yourself in the foot than with other languages (however this is even more true with C).
First, I'm not sure what you mean by bad at resizing. What particularly about it is bad?
Re vector vs list, cache misses are a drastically significantly more important performance impact than anything else these days & you're guaranteed them when using a list unless you do lots of jumping through hoops to allocate everything in a contiguous array (but then why not vector?).
I highly recommend you watch http://channel9.msdn.com/Events/Build/2014/2-661 (the performance stuff starts roughly around the 20 minute mark) & is language agnostic (true for C++/C/Java, etc).
Additionally, in C++11, if you declare your move constructors noexcept (or use the default one or omit destructor, move/copy contructors & move/copy assignment operators), then resizes of vectors can be much faster (if the move constructor is faster than the copy one) since the container doesn't need to copy your objects, it'll just move them.
C++11 lets you also express your non-assignable types without resorting to pointers by declaring the type to be move only (thus Person can live on the stack or moved to live in the heap):
Additionally, it's unlikely that Person would actually own anything. Instead you can omit all the constructors:
This post again shows that you are the problem, not the language. Why are you using "std::list <person*> people;" instead of much simpler "std::list <person> people;"? If you are worried about construction cost, use emplace_back.
You should rather:
std::list <person> people;
The equivalent of the pointer in your C program is an iterator std::list<person>::iterator which *does* allow erasing in O(1). What you don't show in your C code is how you got the pointer to the element in the first place, traversing the list?
Now you get the same or better performance with more safety/higher level of abstraction.
Your cpp and c snippets not equivallent. Results of using them are different in case when list contains same pointer twice(for whatever reason).
This statement is incorrect: In C++ solution erasing an object from the list has O(n) complexity
This one correct: In C++ solution erasing an object from the list has O(n) complexity by object value
In C++ solution you should use iterator, not pointer. Erasing an object from list has O(1) complexity by iterator
P.S. you have an error in C++ snippet. people.erase(it) method invalidates iterator, so ++it it is undefined behavior
Text modified to address both comments.
Did you update the C implementation to allow for person to be in multiple lists?
Nope, but it looks like this:
struct person
{
struct person *list1_prev;
struct person *list1_next;
struct person *list2_prev;
struct person *list2_next;
int age;
int eight;
}
1. That multiply-linked list element is truly horrible and non-scalable, if you need to later add another list, you're going to have to keep adding next/prev pairs.
2. If you've got an object in 10 lists, why on earth are you iterating all of them together? That sounds horribly like a contrived reply to valid criticisms.
I come away with the feeling that you've not invested the time to learn C++ design properly, and then blame the language for your horrible implementation.
1. is the point of the article. Due to the described encapsulation problem there's no way to scale the solution in C++.
The "private in" construct would solve that. I am also said that Scala has a similar construct.
If I understood your explanation correctly, "Private in" is called "friend" in C++. Please look up the syntax in the C++ book/site of your preference.
It's not. Please, re-read the article.
sure you can keep encapsulation, write idiomatic C++ and not be left with the overhead of any superfluous run-time overhead:
Find code on codepad (dot) org (slash) MqAloPgJ
and whats more you get list elements that remove themselves when they are deleted resulting in cleaner, safer code.
From the article: "The real reason why any C++ programmer won't design the list in the C way is that the design breaks the encapsulation principle: The implementer of the "person" class has to know that its instances will be eventually stored in "people" list. Moreover, if a 3rd party chooses to store it in a different list, the implementation of the person would have to be change. This is an anti-pattern that OO programmers learned to avoid."
See, the object needs to know that it will be part of the list:
class important_object: public list_node<important_object>
I was wary about using the word "encapsulation" to describe the OO-ness that the code preserves.
The point is, you can write an intrusive list that is not intrusive in your client code. Yes you have to inherit from the node class but (my implementation is not full, but Google it someone has probably done it), but the important point is if you were to need to change the design and re-factor your code it would be relatively easy, depending what you are doing of course you will probably only have to touch the class definitions not the client code. It does not "degenerate to C style"
Talking about what "a C++ programmer" would do is really a straw man argument. The point is C++ generics provide the tools to represent any abstraction you might think of and do so without any run-time overhead. Yes the implementation of the patterns can look pretty scary (in large part to the verbose template syntax) but usage usually has acceptable to good syntax.
I have been reading a lot of language specifications and papers on different type systems and programming paradigms recently and the biggest surprise that I have had from the whole project is that it turns out most of the ideas can be implemented in a usable way as a C++ library. Think very carefully before you say something can't be done.
Finally if you have a case where your code is easier to read and maintain and more efficient in C style then do it, it certainly doesn't mean that your whole project has to "degenerate".
Why do I care enough to write this?
answer 1: C++ provides the tools to let you deal with abstractions at compile-time, C doesn't. In C you must therefore either do without the abstraction or use run-time abstractions leading to just the sort of indirection this article seeks to avoid.
answer 2: I'm a geek
answer 3: someone is wrong on the internet ;¬)
PS I have very much enjoyed using ZeroMQ
:)
Ok, I guess you are into a more theoretical discussion. Here's my take then:
What OOP delivers is a way to decompose the universe into well-delimited objects. The appeal is that human brain is wired to deal with well-delimited objects and thus OOP allows common sense to kick-in (rather than pseudo-scientific reasoning required by non-object-oriented code).
In other words, if there's a box of chocolates, common sense assumes that these are two well-delimited objects and there's nothing chocolate-like about the box and nothing box-like about the chocolates.
So far so good. There's nothing wrong about OOP paradigm.
Enter C++. C++ have evolved from C and re-used C "struct" construct to define classes. What that means is that "well-delimited object" maps to "continuous chunk of memory" in C++.
This tight coupling between "object" and "memory allocation" works well in most cases, however, in few special cases it goes astray. When thinking about lists, the only way to separate box properties from chocolate properties is to allocate 2 memory chunks instead of a single one per chocolate in the box. That's basically what STL does.
Intrusive containers merge the two chunks into a single one, but then some of the properties of the box (prev,next) leak into chocolate object. Which, of course, violates the basic promise of the OOP paradigm (well-delimited objects).
I would say there's no way out, unless the language doesn't enforce tight coupling between objects and memory allocations.
"but then some of the properties of the box (prev,next) leak into chocolate object"
Intrusive containers' helper objects (the equivalent to my list_node) can quite easily make their methods private but share them by friendship with the list and iterator classes so that the appear from the outside as a normal container class. say "intrusive but not obtrusive"
"I would say there's no way out, unless the language doesn't enforce tight coupling between objects and memory allocations."
Not necessarily but it is the easiest way of telling the compiler how the data should be laid-out. If you break the coupling between object and memory structure you will end up having to look up every property like Ruby and its dynamic bretheran, and that is a _lot_ slower.
Could you point me to the actual code where this is a problem. Talking in the abstract and with toy examples can end up going 'round in circles.
I wager I can find a C++ solution that will deliver better encapsulation, making your code safer, more expressive and easier to maintain, without compromising runtime performance when compared to the bare C solution you describe. And that even in this case to which you feel C++ is particularly unsuited.
I say this with all humility, knowing I am not an equal to you as a software engineer.
Ok, let's go for it! :)
My problem is as follows:
I have many classes in the project, each has many members. I have also few mutexes. Each mutex synchronises some (but not all) members in some classes. Say mutex 1 synchronises A::a, A::b, B::i and C::x, while mutex 2 synchronises A::c and C::y etc.
Now, the challenge is:
"can quite easily make their methods private but share them by friendship" — I've actually written a blog post about this: http://www.250bpm.com/blog:9
"My problem is as follows:…."
Can I see the actual code? I searched zeroMQ for use of std:list and came up empty handed.
"I've actually written a blog post about this"
Something like that but the next and previous pointers in the helper class need to point to person objects because you can't get to the person class from the helper class and all you are left with is a linked list of helpers (I guess strictly you could probably do some black magic where you find the offset of the helper data I'm assuming that's not what you wanted to do).
0MQ code is to convoluted to be discussed here, but try adding proper encapsulation to this:
As for the black magic, casting from member to the container is a standard way of doing things in C. Check e.g. Linux kernel sources.
black magic: that's interesting - does that allow you to access a member of a struct via a pointer to another member as efficiently as if you had a pointer to the struct itself or is there another step involved? I would also be interested to find out whether in any circumstances this sort of trick prevents optimizations by obfuscating what we are doing to the compiler… very interesting.
"If possible, make compile complain when synchronisation is done incorrectly"
that sounds like a job for typestate. Implementing some approximation of this with C++ metaprogramming sounds like a very interesting project (and as I think of it, some new C++11 features, in particular type erasure and variadic templates might help a lot)
"0MQ code is to convoluted to be discussed here, but try adding proper encapsulation to this"
I'm sorry I just don't have the context to understand the problems with this sort of example. If you can point me to where I can find the code, which classes in particular are problematic (what corresponds to A B C and D) then I will try to deal with the complexity or admit defeat.
black magic: It compiles into a single subtraction operation, so it's very quick. Also, it's done on regular basis in linux kernel so my guess is there is no serious performance problem like messing with optimiser. If there was, it would have been fixed long time ago.
"that sounds like a job for typestate. Implementing some approximation of this with C++ metaprogramming sounds like a very interesting project (and as I think of it, some new C++11 features, in particular type erasure and variadic templates might help a lot)"
Go for it. It would be interesting to see what C++11 has to offer in terms of decoupling memory layout from accessibility settings.
"I'm sorry I just don't have the context to understand the problems with this sort of example. If you can point me to where I can find the code, which classes in particular are problematic (what corresponds to A B C and D) then I will try to deal with the complexity or admit defeat."
0MQ goes to great lengths to avoid this kind of problem. See the convoluted algorithm for managing list of pipes in lb_t and fq_t classes. The list stores pipe index (position in the list) inside of the pipe, however, pipe is not necessarily living in the same thread as the list. What you get is members that are living in different threads packed without any safeguards into a single class. There are lock-less algorithms involved etc. so it's really top complex system to demostrate theoretical issues on.
The ABCD snippet in the previous post is much cleaner example. The only goal is to ensure that individual variables are accessed only by their owners, i's by A, j's by B. Attempt to access say C::i from B or C should yield a compile-time error.
The only solution I can think of is along the lines of enclave pattern, I've described in my blog. However, that's hardly an idiomatic C++ and leaves me asking whether using plain old C is not a better solution.
…ok this is interesting.
I thought we were talking about a linked list and that the problem was simply going to be one of choosing which intrusive linked-list implementation best fitted the problem.
This however is a much less well defined problem, but quite interesting.
What is going on in the code now is clearly not ideal.
That array class is crazy, just some thoughts so far, no need to respond:
1) why do we want random access to an array which does not
guarantee order is preserved? certainly it doesn't seem like a priority
that this is O(1)
2) pointers can be invalidated because deletion of objects does not remove them
from the array - this leads to a need to always check the pointers.
3) pointers can be invalidated because we are allowed access to the pointers
by reference
4) the array indexes can be invalidated because set_array_index is public.
5) The comments claim that insertion in O(1) but this is only average time.
Worst case time is bad because push may require new memory allocation and
copying of the array.
An intrusive linked list could actually solve all of these problems, though iteration would obviously be slower and if there is some reason why you need O(1) random access via index then that would be a problem too. Intrusive linked lists do not require any allocations to add existing nodes to a list, for this reason they are good for "real time" applications.
On the broader question of how to design the locking properly I am going to have to study the code more and review the literature in this area. I have a few ideas.
I'm gonna stop myself babbling, I will probably go quiet for a while but I will try to come back to you with something useful.
OK. Give it a stab.
However, I believe the problem is inherent: You can improve the code until it is cleanly implemented intrusive list. Then you have parts of the list object leaking into item object (next, prev). You need to ensure that these won't be used by anyone other than the list. So you'll probably go with something like the enclave pattern. At that point you should look on the code and ask: Are all these hacks used to get something that C++ promises out of the box (member access control) really worth it? Wouldn't it be cleaner to use raw C instead?
Actually you're wrong about the extra allocations with stl containers/iterators. Using vector<person> (not the lack of *) I had exactly 0 overhead. The class is exactly 8 bytes http://ideone.com/C4j0BG
Using list I get 24 bytes. I'm not sure where the extra few bytes comes from but that may be solved using a different implementation. Templates/oops doesn't cause extra allocations http://ideone.com/qGp2Am
The assumption is that the object is not assignable (imagine object that contains a running thread). Thus, there's need to store it in the list by reference, rather than be copying it.
Regarding your clarification on iterators - it is not entirely correct. With the proposed C structure you cannot have it in the several lists, because next and previous fields can make it part of only one list. So using iterators in C++ is still a viable alternative.