Have you ever seen a function that spans 1500 lines of code?
At that point every semi-decent programmer curses spaghetti code in general and the author of the function in particular, and embarks on the task of breaking it into manageable chunks: decomposing the problem into orthogonal issues, layering the design properly, moving the common functionality into base classes, creating convenient and sufficiently generic extension points, etc.
If done properly, they'll end up with well-defined components, short functions that do one thing only, and, generally, with nicer and more maintainable code.
Everybody is happy and everybody pats the semi-decent programmer on the back.
…
It turns out that the 1500-line function was parsing a network protocol. It is a 30-year-old, complex and convoluted behemoth of a protocol, defined by many parties fighting over the specification, full of compromises and special cases, dragged through multiple standardisation bodies and then slightly customised by each vendor anyway.
At some point, the product is sold to a different customer who happens to need integration with software from a different vendor, Foosoft Inc. As already said, there are minor differences in the protocol among the vendors. The parser has to be slightly altered.
Unfortunately, it turns out that the tweak intersects the boundary between two well-defined components in the implementation. The right thing to do would be to re-think the architecture of the parser and to re-factor the codebase accordingly.
However, the discrepancy in the protocol is really a minor one and has to be accounted for by tomorrow. There is no way the programmer is going to sell the idea of re-factoring the whole parser to the management. Thus, he goes for the minimal change and modifies the following function:
int new_sequence_number (int old_sqn)
{
    return old_sqn + 1;
}
in the following way:
int new_sequence_number (int old_sqn, int foosoft)
{
    if (foosoft)
        return old_sqn + 2;
    else
        return old_sqn + 1;
}
Later on it turns out that the problem of inconsistent sequence numbering was recognised by yet another standards body (Bar Consortium), which released version 5.34.2 of the protocol that actually unifies the algorithm for generating the sequence numbers. Now we have to support version 5.34.2 as well:
int new_sequence_number (int old_sqn, int foosoft)
{
    if (foosoft == 2) // version 5.34.2 of the protocol
        return old_sqn + 3;
    else if (foosoft)
        return old_sqn + 2;
    else
        return old_sqn + 1;
}
At this point you should be able to see where I am heading: after a few such iterations the code that was originally nice and manageable becomes a mess of short functions plagued by lots of behavior-modifying arguments, the clean boundaries between components and layers slowly dissolve, and nobody is sure anymore what effect any particular change may have.
Let's imagine we want to make the above function nicer:
int new_sequence_number (int old_sqn, int step)
{
    return old_sqn + 1 + step;
}
It looks all right, but then it turns out that foosoft=4 is already being used to identify yet another vendor, one who uses a sequence numbering step of 2. The beautification has actually broken the product! The patch is hastily reverted.
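To spell out the breakage (the return values follow directly from the two versions of the function shown above; the call sites are illustrative):

/* Old code: foosoft=4 is non-zero and not equal to 2, so it falls
   into the "else if (foosoft)" branch and yields old_sqn + 2 --
   exactly the step of 2 that this fourth vendor needs. */
new_sequence_number (10, 4);    /* returns 12 */

/* Beautified code: the second argument is now interpreted as a
   step, so the very same call site silently changes behaviour. */
new_sequence_number (10, 4);    /* returns 15, i.e. 10 + 1 + 4 */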
At this point everybody wishes they had only a single 1500-line parsing function, which would be much easier to understand, debug and modify than this mess of dozens of interconnected abominations. It's too late to do that, though. Nobody dares anymore.
…
The point I am trying to make in this article is that sometimes it's the problem domain itself that's complex and convoluted. In short, it's a spaghetti domain.
Human-crafted and gradually evolved domains are often like this. Think of law. There are more exceptions than there are laws. Think of all kinds of standards. Regulatory directives. Products offered by a big bank. Different flavours of the ARM microarchitecture.
In such cases you should seriously consider whether not breaking the problem into fine-grained orthogonal chunks at all may be the right thing to do. You may end up with long functions that will be frowned upon as spaghetti code by everyone who ever looks at the code, but you will sleep peacefully, knowing what kind of manageability nightmare you have avoided.
PS. If you happen to be a developer of a static analysis tool: consider leaving long functions alone rather than reporting them as suspicious. There are circumstances where long functions are perfectly legitimate and desirable.
EDIT: Long functions with gotos are as bad as a mess of little functions. Luckily, though, 45 years after Dijkstra's paper, goto is not used extensively any more.
EDIT: The discussion about this article on Reddit gave me an idea for a good example of a spaghetti domain. Imagine you are modelling buildings in software. First you model the Empire State Building. It's rectangular, it has 102 stories, most of them almost identical. Individual features are repeated over and over, etc. It's a nice example of an orderly domain. It can be described in a few simple functions. Then you try to model Kowloon Walled City. Any generic rule you try to make about it was violated by some poor Chinese immigrant trying to survive in the compound. No feature is repeated. Ever. In such a case it's better to draw a map (as the Japanese speleological team that explored the building did) and put every little detail in there.
Martin Sústrik, March 25th, 2013
I've never found any web of interconnected components so bad that I'd prefer to have a 1500-line function.
Sure, rushed bugfixes make code worse. But there's nothing magic about 1500-line functions that means a rushed bugfix is not going to make it worse. You're just less likely to notice because by the time you've allowed a 1500-line function you've given up trying to sanely maintain the system.
That's the whole point. You can give up immediately. There's no way to sanely maintain something that is insane in the first place. The body of law is a pretty good example of the phenomenon. It's too complex and convoluted to model it in a nice and sane way.
This sort of reminds me of Rodney Brooks's commentary on essential and accidental complexity. Once you have got rid of the accidental complexity, you are left with the essential complexity, which cannot be further reduced without compromising on functionality.
You can shovel the essential complexity around your system design all you like, but the overall quantity of complexity still remains constant. A practical consequence of this is that you can move the complexity away from the 1500 line function by splitting it up into smaller functions, but that complexity remains embedded in the implicit relationship that exists between all of your 50 line functions. (Plus some additional overhead from the function declarations themselves).
Of course, splitting up the large function into smaller ones gives you other benefits: Primarily the ability to test each small function in isolation, but also (and more importantly) the ability to read and understand the scope and purpose of each small function without being confused by irrelevant details.
Personally, I would like to have the ability to give names to chunks of code within a large function - and to write tests for those chunks without having to extract them into separate smaller functions.
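Something like this, sketched in C; bare blocks with naming comments are the closest thing available today, and the point is precisely that those names cannot be targeted by tests (all names below are made up):

int parse_frame (const unsigned char *buf, int len)
{
    int payload;

    {   /* chunk: validate_header -- I'd like to name and test this */
        if (len < 4 || buf[0] != 0x7e)
            return -1;
    }

    {   /* chunk: decode_payload -- and this one, separately */
        payload = buf[2] | (buf[3] << 8);
    }

    return payload;
}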
I would also like to see better tools (and widespread usage of those that exist) for building various graphs and diagrams showing the call-graph and other relationships between functions and objects, so that we can explore, understand, and communicate the implicit complexity encoded in program structure more easily.
Generally, agreed, but:
In the examples above I've tried to explain why expecting each function to have nice and neat semantics is naive. As the system evolves, the existing decomposition into functions is going to gradually break down (because the domain itself is spaghetti). You'll end up with functions with 25 boolean arguments, each saying 'do this little thing a little bit differently' (see the sketch below).
Even worse, the concept of what each function is designed to do will slowly dissolve because of the said changes. At that point the conceptual model of the application remains only in your head. Once you leave the company and the code is assigned to a different developer, he's not going to understand what the functions are supposed to do, and he will wish you had just written a single 1500-line function that's at least kind of linear and easy to follow.
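A deliberately exaggerated sketch of where that ends up (every flag here is invented for illustration):

int new_sequence_number (int old_sqn, int foosoft, int barsoft,
                         int legacy_mode, int strict_5_34_2)
{
    /* Nobody remembers which combinations of these flags are
       actually used in the field. */
    if (strict_5_34_2 && !legacy_mode)
        return old_sqn + 3;
    if (barsoft)
        return old_sqn + (foosoft ? 4 : 3);
    if (foosoft)
        return old_sqn + 2;
    return old_sqn + 1;
}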
Sorry for the pedantry - wasn't it Fred Brooks who wrote The Mythical Man-Month, while Rodney Brooks is into robots?
Good point. I often look at problems through the "SOA architecture" lens where it is encouraged to think of all the "loose coupling" and "tight coupling" between things and to try to concentrate as much of the necessary "tight coupling" in a few choice places so that everything else has very loose coupling and is thus more flexible and composable.
In other words, in a situation like this, much of "the spaghetti domain" probably gets concentrated in this hypothetical 1500-line function so that most of the other functions will end up being a lot cleaner. To break up the 1500-line function into a whole bunch of components will create a crazy functional/inheritance diagram between dozens of classes that don't really have anything to do with the problem domain.
Yes. Great way of putting it!
There are several things that I consider wrong with this approach.
First, the text doesn't mention it explicitly, but it seems that the issue was detected when the new code was already in production. It should have been picked up in a staging environment, not in a production environment.
Second, when you encounter a huge-ass function, the first thing you have to do is look for tests. If there are no tests, the first thing you must do is write the tests - from whatever documentation / knowledge there is. *Then* you can start refactoring.
If the code had been passed around several times and it had no tests and no docs, chances are that you will get bitten by "implicitness" like the foosoft=4 case.
But the responsible way of doing things is not "reverting the patch and leaving it like that". You add another test for the one-off case (foosoft=4), then redo the "beautification" (more on that later) and then you push those changes to the staging server for testing. And after they are verified, you can move them to production (even then, you can roll them out gradually, with feature flags). And if new "implicit cases" arise, you add tests and refactor until everything is discovered and nothing remains implicit.
Finally, the article seems to imply that refactoring is "about the looks". It mentions "code looking nice" and "beautification". The point of refactoring is not beauty; it's utility. Code is refactored so that it is *better at communicating what it does to humans*.
A proper refactoring of the function would have taken this into account.
The important thing this function does is *not* calculating the new sequence number; it's deciding what kind of sequence we're calculating. The particularities of how to calculate it are left to other functions. By using explicit names we make the code more *clear* (not necessarily more beautiful or elegant). There's no "implicitness" here.
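Something along these lines, sketched in C (the enum and all its names are mine, not anything from the article):

enum seq_scheme {
    SEQ_DEFAULT,       /* the original protocol */
    SEQ_FOOSOFT,       /* Foosoft's variant */
    SEQ_BAR_5_34_2     /* Bar Consortium's unified algorithm */
};

int new_sequence_number (int old_sqn, enum seq_scheme scheme)
{
    /* This function decides which numbering scheme applies;
       the schemes themselves are named explicitly. */
    switch (scheme) {
    case SEQ_BAR_5_34_2:
        return old_sqn + 3;
    case SEQ_FOOSOFT:
        return old_sqn + 2;
    default:
        return old_sqn + 1;
    }
}

The caller now states which protocol it speaks instead of smuggling that fact through a magic integer.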
If you "leave the big functions untouched" you are part of the problem, and a not particularly good programmer. Sure, if an impending deadline forces you to, take shortcuts, but reserve some time later on to "pay the debt" you incurred into.
OK, here's the story:
You've inherited the code in question. The particular case foosoft=4 was written 20 years ago by a guy who died in the meantime.
You find the problem and you see the code is in terrible shape, and you are able to convince your management to spend, say, one man-year on deciphering the code and writing the tests. Then you convince them to allocate one more man-year to refactor the code in a clean way.
Once done and deployed, you get a new change request from the customer. But alas! It cuts straight through your well-defined functions and components! (Remember that this is inevitable: the domain itself is spaghetti, such as regulatory directives or telco protocols.)
At this point you'll try to convince your manager to allocate more time for refactoring. The answer will be: "You've got two man-years to do refactoring. You finished two weeks ago. And now you're asking me for more time to refactor? Shut up and do your work."
And he is right. Refactoring every two weeks is not a realistic option. In such an environment, simply keeping the code in a single function is the best bet: it's at least linear and relatively easy to understand and modify.
You are using strawman arguments here.
First you talked about a 1500-line function. Then you say that you need one man-year to decipher it and write the tests. That does not match.
If it's a huge library with lots of functions, you don't need to do everything in 1 go. You can refactor as you go: when you need to "touch" one function, write the tests for that individual function, and then refactor.
Saying "the domain is spaghetti" does not mean anything. All domains change. All domains that are worth writing programs about are complex.
Code can't be done like a building, or a bridge. Bridges have the laws of physics. Code lives in a changing environment. You can't build a bridge if gravity changes one moment and steel becomes jelly the next.
That's why they pay us: to make code that not only works, but adapts to changes. That is the whole point. You must know how to make code that adapts. If you don't, search for help - read, go to workshops, or do something. It is your job to know how to do those things.
The code you showed is not a product of "the domain". It's a consequence of bad programming practices, and nothing else (with the possible exception of badly informed management).
Another strawman argument. I never said to refactor every two weeks. I say refactor *as much as you can*; not every two weeks - every day.
And explain the problems of not refactoring to your manager, which is also part of your job.
Your manager's immediate job is to increase the ROI, but on a larger scale he should also make informed tactical decisions. Your job is to provide him with all the information to make those decisions.
Let me present you both scenarios.
If you add tests and refactor, at the beginning (and for some time, depending on the size of the library) you will go more slowly. But sooner or later adding features will become easier and you will get fewer bugs. And you will be happier. If you end up leaving the project, a new programmer picking it up will have lots of help from the tests.
If you just leave the big functions be, you will eventually get tired of maintaining that monstrosity. And then you will start looking for other jobs. And then you will leave your company, and your manager will be left with a new programmer, who won't have any idea of what the code does or why. The cycle will repeat.
Adding new features will get increasingly more difficult, and bugs will appear with increasing frequency. At the end either the company will lose their clients or they will have to abandon the product and start from scratch.
Choose.
It looks like you have never worked in a spaghetti domain.
If you are really interested, look for example at the telco protocols at 3gpp.org. You'll find thousands of pages of completely incoherent stuff.
If you value your time you should rather skip that and think about a similar system that we are all familiar with: the legal code. Once again, thousands of pages of incoherent and sometimes even contradictory rules. Then yet more pages to paper over the inconsistencies. Etc.
The problem with modeling such domain is that no rule really holds. Any assertion you'll make will have exceptions.
Now, without universally valid rules, there's no way to decompose the problem into smaller, manageable chunks. Any boundary you draw between components will be violated by some peculiarity from the domain.
From my own experience, what happens in reality is that people start modelling a little subdomain of the whole spaghetti domain and they find some generic rules that hold. They decompose the application according to those rules, only to find out that all their assumptions are violated when the scope of the project extends a bit out of the little subdomain. They may refactor at that point. But then the scope widens again and again and nobody really wants to do any more refactoring, so it all ends as a tangle of incomprehensible functions.
What I am trying to say is that you sometimes have to acknowledge that the domain you are modelling is a mess and will remain a mess. In such an environment it's better not to try to be smart and to go for the simplest possible modelling paradigm (e.g. a long function with a lot of 'if' and 'switch' statements).
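For example, something as deliberately flat as this (every name and special case below is invented for illustration):

int parse_sqn_field (const unsigned char *buf, int vendor)
{
    /* One linear function; vendor quirks handled inline where
       they occur, with no attempt at a clever decomposition. */
    int sqn = buf[0] | (buf[1] << 8);

    if (vendor == 1)                   /* vendor 1 counts from 1, not 0 */
        sqn -= 1;
    if (vendor == 2 && sqn == 0xffff)  /* vendor 2: 0xffff means "unknown" */
        sqn = -1;

    return sqn;
}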
This problem is very common in device drivers. Vendors clone each others chips, and each rev of silicon introduces new warts, features, and outright bugs.
The tulip.c Linux driver used to be a case in point — it was several times as large as it would have been had it supported just one variant of the device — it was more special cases than actual code flow. (I've not looked at that code in over a decade.)
When I wrote drivers for similar chips for Solaris, I quite consciously decided that I would *not* try to cover all the variants in a single driver.
Sometimes it's easier to just copy the source to a new driver to support the new chip (or network protocol!), and edit it for the new one without worrying about preserving compatibility with the old. It's also *safer*, since you've not broken anything that *was* working, and your new test matrix only has to consider the new device/protocol/case.
Polymorphism is great. It also kills. Don't be afraid to just *clone* instead. :-)
(Btw, this is why my Go version of the SP protocols will never have support for ØMQ. If someone wants that functionality in Go, they should just copy my project, modify it for ØMQ, and make no effort to preserve compatibility with SP/nanomsg. I might actually do that work myself.)
Great example, thanks!
The interesting topic to think about is what tools do we have in our bag to deal with spaghetti domains. Only two of them come to mind readily: limiting the scope ("we'll deal with only this little part of the domain and deliberately ignore the rest") and using copy+paste instead of more sophisticated problem decomposition tools.
Would refactorings - "introduce interface", and so on - work here? Code in a predicate to that abstraction so the right one can be activated, and avoid the if/else stuff (see the sketch below). Not a new idea at all - https://www.jetbrains.com/idea/webhelp/extract-interface.html - it relates to plugins, DI, and also BranchByAbstraction (but that last one is more about migration).
And yes, I've worked in spaghetti code before now.
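In C terms, "introduce interface" might look like the sketch below: a per-variant table of function pointers, selected once by a predicate, so that call sites stay free of if/else chains. (All names are illustrative; whether the interface boundary survives contact with a spaghetti domain is exactly the question.)

struct seq_ops {
    int (*next) (int old_sqn);
};

static int next_default (int old_sqn) { return old_sqn + 1; }
static int next_foosoft (int old_sqn) { return old_sqn + 2; }

static const struct seq_ops default_ops = { next_default };
static const struct seq_ops foosoft_ops = { next_foosoft };

/* The predicate lives in one place; everything else just calls
   ops->next and never mentions vendors again. */
const struct seq_ops *select_seq_ops (int vendor)
{
    return vendor == 1 ? &foosoft_ops : &default_ops;
}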
The blog post is confusing, because it sounds like it is arguing for a conclusion something like "the domain is spaghetti-shaped, so the code should be spaghetti-shaped". Some commenters are trying to point out how silly that argument is. Clearly "the domain is red, so the code should be red" or "the domain is visual, so the code should be visual" or "the domain is three-dimensional, so the code should be three-dimensional" are analogous, false claims.
In fact, we should be focusing on the dynamics of the domain - how did it change in the past, and how do we anticipate it changing in the future, and which software architectures support which kinds of dynamics.
For example, imagine a domain that changes for various complicated, human reasons that nevertheless preserves a particular rigid concept like a buoy floating on choppy water. Factoring that rigid concept out into some structure (a function, class, module, whatever) is probably a good thing to do.
Similarly, when the blog post tells about a change that cross-cuts the existing architecture, leading to no good short-term options for evolving the system, the problem is that the (badly refactored) architecture mismatches the dynamics of the domain. However, that is not an argument FOR spaghetti code - because the spaghetti code may also mismatch the dynamics of the domain, and there may be other (more modular) architectures that do match the dynamics of the domain.
For example, if there are a mountain of shipped devices that will not be upgraded, and we need to be compatible with, then the existence of that mountain may be a durable aspect of the domain, and it may be appropriate to factor out a structure (function, class, module, whatever) to deal with it. If there is a big, standards-bending vendor in this market, they may be a durable aspect of the domain, and it may be appropriate to factor out a structure to specialize in that vendor's peculiarities.
Focus on the history of changes and likely stabilities in the domain in order to refactor towards maintainability; you cannot just "follow your nose" and narrow your gaze to the source code.
Yes. If there's an invariant in the domain it should be factored out in the software. But how is that blog-worthy? It's common wisdom.
What developers often forget about though — and what is therefore worth pointing out — is that after factoring out all the invariants there remains an unfactorable residue, which happens to be pretty big in some unfortunate domains.
This blog post's claim is that spaghetti code is the correct implementation of a domain with complex, human dynamics.
It supports that claim with an illustrative example, where someone (foolishly) refactors a spaghetti implementation to a structure that does not reflect the complex, human dynamics, and then suffers thereby.
However, the illustrative example only supports the weaker proposition "code structure should reflect the dynamics of how the code changes".
In particular, I think a rule-engine might be more appropriate than spaghetti code for a domain that changes "legalistically". Alternatively, reifying vendors might be appropriate for a domain (such as device drivers?) where large vendors are long-term stable aspects of the domain.
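For the sequence-number example, a rule engine could be as small as a table scanned in order, first match wins (the rules and fields below are invented for illustration):

struct rule {
    int vendor;        /* -1 matches any vendor */
    int min_version;   /* rule applies from this protocol version up */
    int step;
};

static const struct rule rules[] = {
    { -1, 53402, 3 },  /* version 5.34.2 and later unify the step */
    {  1,     0, 2 },  /* the Foosoft quirk */
    { -1,     0, 1 }   /* the default */
};

int seq_step (int vendor, int version)
{
    /* Scan the table; the first matching row decides the step. */
    int i;
    for (i = 0; i < (int) (sizeof (rules) / sizeof (rules[0])); i++)
        if ((rules[i].vendor == -1 || rules[i].vendor == vendor) &&
            version >= rules[i].min_version)
            return rules[i].step;
    return 1;
}

New legalistic exceptions then become new rows rather than new branches.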
Essentially nothing in the real world is unfactorable. Your example doesn't have a correct architecture ONLY because it is a contrived example that does not have the patterned dynamics that occur in real-world domains - and "doesn't have a correct architecture" is not the same as "spaghetti code is the best available architecture".
The things that evolve are more or less unfactorable: Human anatomy. Body of law. Politics. Telco protocols. The old town of Marrakesh.
It looks all right, but then it turns out that foosoft=4 is already being used to identify yet another vendor, one who uses a sequence numbering step of 2. The beautification has actually broken the product!
If that were the case, the original product would have already been broken, since there is no explicit case for foosoft=4 to return old_sqn+2; it would return old_sqn+1. In any case, it seems that you're looking at the problem in the wrong way. The way you changed the function changes its meaning. You're taking a supposed vendor "id" and using it as something else: a sequence interval. Refactoring is meant to change the code to be more readable without changing its function.
This seems like a poor example.
Ha, you've not actually experienced this, have you? The original product doesn't need to be broken. You may just be a small player in a big world. If you make a product that interacts with other products over a common standard, and big players make a mistake or ship a specific non-standard implementation, and your customers need your product to work with them, then you have to play ball. Even if it means that the big kahuna introduced a change that broke what was otherwise a working product.
Imagine that, about 7 years ago, you need to write code to communicate over one of the non-CAN OBD-II protocols. You grab the spec, rent a lot of vehicles, then write your implementation. In the meantime, GM makes a change which breaks your code.
You don't make cars; you make an add-on that's supposed to work with cars. Do you really think you're going to convince your customers that it's GM's implementation that's broken? No. Here's the problem: a few other car companies also have slight tweaks that differ from the standard. Again, you can't simply finger them as the bad guys. So, you add some spaghetti to your code and make it work.
In case you think this is a contrived example, it's not. I've been the poor engineer in the sad position of writing code to hack around major car companies' actual implementation of something vs what the standard says.
I opine that blaming a spaghetti domain cannot be a sufficient reason to let the code be spaghetti. To give an analogy, just because the ground below is uneven doesn't mean that a civil engineer can erect an uneven structure - it would just collapse.
Speaking of chaos in human-defined concepts like law, telecom standards, etc., and the domains you might have worked in: they surely cannot be as chaotic as real-life phenomena themselves, where we have come of age trying to employ machine-learning techniques to bring order to, but not give in to, the 'spaghetti' nature and immense chaos of the enormous amounts of data being churned.
Engineering needs to be based on principles and facts and should always be an endeavor to bring order to chaos. If your code that is split into multiple functions is becoming unwieldy, it is surely asking for a major overhaul, or has perhaps reached a limiting point (you can't keep adding floors to build an edifice of 100 floors on a foundation for 10 floors), in which case you either stop working on it further or clone new components as Mr. Garrett D'Amore suggests - but you can never relapse into having 1500-line functions, which are evidently unwieldier.
The argument in favor of 1500-line functions, justified by appeal to some 'spaghetti' domains, is myopic and untenable - instructions to a computer, whether low-level or high-level, ought to be organized to be simple and elegant.
I think you are forgetting the definition of spaghetti code.
Let me give you a trivial, but real example.
You write code for a product that interacts with other products using a published standard. Your product costs $. Another manufacturer's product, which almost all your customers use and you need to interoperate, costs $$$.
Other manufacturer makes a breaking change. You don't know when they're going to issue a fix. But, you do know, given a choice over whether to trash other manufacturer or trash your product and use a competitor, customers will simply dump your product.
What do you do?
You can take the high road, say to yourself "there is no reason for spaghetti code", and go out of business. Or you can add in a line of code to handle this one-off exception and keep trucking.
By definition, that one-off line of code (or lines) that exists solely for that one-off corner case is spaghetti code. I suppose you could just pretend it's now part of the problem domain and un-spaghetti it.
But since other people, when they look at it, will instantly recognize it as spaghetti code, your decision to redefine the terminology doesn't mean much.