Sunday, 24 May 2015

Rejection Letter For the Paper 'A Falsification of the Aristotelian Theory of the Free Fall and an Alternative Theory'

(a pdf of this text can be found here)

Cynicism is definitely problematic when used in academic argumentation. The text below, however, is not meant to be cynical. It only tries to describe an unexceptional situation in the current scientific selection process: the rejection of a paper that is based on measurements. The kind of argumentation in the reviews is not invented; it is distilled from multiple reviews I have received over the last few years. The text gets its cynical flavor because it refers to a work whose result and impact are known to us: the (hypothetical) experiment by Galilei on free fall.
It should be clear that I do not, and will not even try to, draw any parallels between my work or anyone else's work and the work of Galilei.
....have fun reading the text...
Stefan Hanenberg
stefan.hanenberg@gmail.com
version 0.1, Essen, 2015-05-23

Dear Mr. Galilei, 

Thank you for submitting the paper 'A Falsification of the Aristotelian Theory of the Free Fall and an Alternative Theory' to the special issue 'Physics and Stuff' of our journal 'Software Technology Usage in Productive Industrial Development'.

Unfortunately, I have to tell you that the paper has been rejected based on the unanimous recommendation of all reviewers.

The reviews are attached to this notification in order to help you improve your paper and your future work.

With kind regards,
The Editor

First Review


Overall merit: 1. Reject 
Reviewer expertise: 1. Expert


Summary


The paper gives a short introduction to the Aristotelian understanding of free fall and discusses potential problems with this theory. Then, the author runs a small controlled experiment where two cannonballs are dropped from some rather peculiar tower. From the experiment's measurements the author concludes that the Aristotelian theory of free fall must be wrong. In a second experiment, the author dropped even more cannonballs with different weights and proposes (based on the measurements) a different, relatively trivial model that is, from the author's perspective, a better theory of free fall.

Review

I am quite positive about empirical studies and results in general. Still, I cannot hide that I am not at all convinced by the experiment proposed here or by the conclusions drawn by the author.

First, the Aristotelian theory and its relevance are hardly described. In fact, the author just says that it is a theory that has existed for some centuries and plays a major role in our current understanding of the real world. However, it is unclear why this theory should be relevant at all, or why a change to the theory should be necessary (taking into account that this theory has existed for centuries!). Hence, neither the background of the theory nor the possible impact of changes to that theory is clearly described, which makes it hard for readers to understand why it should even be interesting to read this paper at all.

Second, the experiment is far from being convincing for a large number of reasons.

1) The author uses cannonballs of different weights and drops them from a tower. As the author mentions, the tower is not an ordinary one but seems to have some peculiarities (why do such towers even exist in Italy?). As a consequence, it is quite plausible that the experiment's results (whatever they are) are highly influenced by the choice of the tower itself and not by the theory being tested. Hence, the reviewer urgently asks the author to rerun the experiment with a different tower.

2) Next, the author chooses cannonballs. In doing so, he completely ignores that cannonballs serve a special purpose. Cannonballs are explicitly designed in a way that their behavior with respect to being shot or being dropped is very similar - no matter what their weight is. Such statements by cannonball designers are even completely ignored in the related work section. Definitely, well-known papers such as "Why I think writing software for cannonball designers is a better option than having no job at all" by Fubar et al. '63, which is a fundamental and ground-breaking paper for the whole cannonball industry, must be mentioned. Hence, it is clear that just because of the subjects used in the experiment it was not even possible to show anything other than similar results - the experiment does not falsify any theory but just gives another indicator for the maturity of our cannonball industry.

3) The measurement process is not described. While the author mentions that the cannonballs had different weights (in the first experiment 1 kg vs. 10 kg), it is completely unclear how the measurements were performed: their precision is not stated and the author does not discuss the procedure. Even worse, the author describes the tower only in terms of its height; the 56 meters are given without describing any other measurement. Taking into account that the tower seems to have some problems with its foundation, the height measurement is obviously not enough to describe the essential parts of the experiment. The time measurement is hardly described: the author just states on page 3 that the time was measured from the moment a cannonball was dropped until it hit the ground and that the author used some clock that shows even milliseconds*10. We need to take into account that even the definition of a second has changed over time (mean solar day second vs. period of the Earth's orbit around the Sun vs. atomic clocks). From that we can conclude that none of the time measurements is trustworthy -- as the author should have noticed, there are small deviations between the different measurements. This shows clearly that the measurements are not trustworthy at all. Hence, we must not conclude anything from the measurements.

4) The sample size is much too small for any serious study. Taking into account that in the first experiment only 2 kinds of cannonballs were used (again, all measurements were performed from the same tower) and only 10 time measurements were collected, it is completely impossible to generalize from the measurements to anything else. However, such a generalization is made in the second experiment. There, the author concludes from 10 different (yes, again) cannonballs and 20 different heights (again without giving a precise description of the measurements) that all measurements can be described with the formula t(h)=sqrt((2*h)/9.8). Again, such a formula cannot be derived from the measurements (for the reasons explained below).

5) The resulting formula appears completely unmotivated -- where does it come from and why should free fall be described in that way?

6) The analysis of the experiment is somewhat obscure, especially when taking into account that the formula yields a precise number: not a single measurement matches the result expected from the formula! Instead of mentioning this obvious fact, the author tries to rescue his experiment with some statistical tests. The applied test (some so-called significance test) is not explained in detail and its rather mysterious results are not explained either. The author should have noticed that even an obvious check (the arithmetic mean of the results) differs from the formula's results.

7) The external validity of the experiment is in fact zero. Only cannonballs have been dropped in order to argue that the Aristotelian theory does not work. I strongly advise repeating this experiment with multiple other objects to be dropped from the tower. It seems obvious to drop additional things such as water, sand, or even complete ships in order to increase the experiment's external validity.

8) The related work section is far from being complete. Again, fundamental works about cannonball constructions are missing, no work is mentioned about the used tower, and not even different works on time measurements have been cited.
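For readers who want to see what the formula quoted in point 4 actually predicts, here is a minimal Python sketch. It assumes that h is given in meters, that t is in seconds, and that 9.8 is the gravitational acceleration in m/s^2; the function name `fall_time` is hypothetical and does not come from the paper under review.

```python
import math

# Assumption: 9.8 is the gravitational acceleration in m/s^2,
# taken directly from the formula t(h)=sqrt((2*h)/9.8).
G = 9.8


def fall_time(height_m: float) -> float:
    """Return t(h) = sqrt((2*h)/9.8) for a drop height in meters."""
    if height_m < 0:
        raise ValueError("height must be non-negative")
    return math.sqrt((2 * height_m) / G)


if __name__ == "__main__":
    # 56 meters is the tower height quoted in the review.
    print(f"{fall_time(56):.2f} s")
```

For the 56 meters mentioned above, this gives roughly 3.38 seconds. Weight does not appear in the formula, so under this model the 1 kg and the 10 kg cannonball get the same predicted time.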

Hence, I conclude from the review above that the paper has to be rejected: the motivation is unclear, its relevance is unclear, the measurements are unclear and, as a consequence, no conclusions must be drawn from these measurements. Additionally, the external validity of the experiment is not given. Although the proposed alternative model for free fall is interesting, the paper gives no valid grounds to trust the model.

Second Review


Overall merit: 1. Reject
Reviewer expertise: 1. Expert

Summary


The paper describes what happens when cannonballs are dropped from a tower. The author performs a number of measurements and states that these measurements conflict with an older theory of the free fall. Finally, the author describes his personal formula for the free fall which does not seem to be consistent with the older theory.

Review

While the general idea of the paper is interesting and the author's conclusions are quite innovative, the paper has a restricted perspective: the whole argumentation is based on quantitative measurements. Probably the most important information is missing in the paper: The design process of the experiment is completely unclear. Why was this extraordinary tower used for the experiment? Why were cannonballs used? Why was the measurement based on time? What were the reasons to come up with the final formula proposed in the paper? How can the formula be explained?

Accordingly, the general comment on the paper is that it lacks any qualitative analysis that is relevant to the studied topic.

a) What additional observations were made in addition to time, height and weight? The pure use of quantitative data is too restricted a perspective on any aspect of daily life. The chosen cannonballs are only explained in terms of weight: it is clear that additional characteristics of the cannonballs are essential, too (What are they made of? Who was the producer? Has their functionality been tested before? How can the surface of the cannonballs be described? Were they comparable?). With respect to height, it is unclear how the tower can be described best. The author mentions (in addition to its height) only one special characteristic of the tower in one single sentence. All other aspects of the tower are completely ignored. With respect to the time measurement, it is unclear why the author tried to measure time in such a complex way instead of simply asking people whether they saw differences in the free falls of the cannonballs. Additionally, the author does not even try to describe the different ways in which the cannonballs fell down (although the measurements do show differences!). Hence, the most essential information -- the different ways in which the cannonballs fell down -- is missing from the paper. Because of the resulting lack of qualitative analysis, it is not possible to find explanations for the differences in the way cannonballs fall down from a tower.

b) The chosen experimental design reveals some obvious weaknesses. In the first experiment, the author uses two cannonballs in multiple measurements. Hence, the individual influence of a single cannonball is very high and it cannot be expected that the resulting measurements imply anything meaningful: it is well-known that AB experiments have the problem of unbalanced groups, and this effect is even stronger in the experiment proposed here because of the use of the same cannonballs. In order to get rid of the problem, it is more desirable to have multiple different things to be dropped from the tower. Again, qualitative studies could help to find out what kind of different things could be dropped from the tower. An additional qualitative study could show what additional items could have been carried to the top of the tower.

c) Because of the missing qualitative data, it is impossible to replicate the experiment. In case someone wants to replicate the experiment, it is necessary to understand what kind of tower could be used in the experiment and what kind of cannonballs could be dropped. Because of the special characteristics of the tower it seems even impossible to replicate the experiment, because it is rather unlikely that such towers can be found somewhere else in the western hemisphere.

While I think that the provided quantitative data has some value it is still necessary to provide additional qualitative data and to run an additional qualitative analysis.

Minor comment:

The paper needs proofreading by a native English speaker.

Third Review

Overall merit: 1. Reject
Reviewer expertise: 1. Expert

Summary: 

The author describes an experiment that tries to falsify an existing theory: the Aristotelian theory of free fall. Based on two experiments, the author comes to the conclusion that the theory must be wrong and proposes an alternative theory.

Review:

The starting point of the paper is quite unusual. While most of the works that can be found today in physics address real world problems such as alternative sources for energy or the construction of large machineries somewhere in Europe (Switzerland), the paper addresses a more basic topic: the free fall. But it is unclear how this could be able to provide any usable or useful insights that could be applied today. Hence, the relevance of the work is unclear.

The line of reasoning in the paper is problematic. The author tries to falsify a theory (free fall) by some measurements in an AB experiment that actually do not differ: the author does not measure a difference between group A and group B. However, the literature explicitly says that measuring no differences is no indicator that there are no differences. It could only mean that the experiment is problematic. Hence, the p-value of .99999 cannot be interpreted as no difference between A and B. The same is true for the second experiment where multiple measurements are compared (minor comment: the author forgot that a correction is needed because of the cumulated alpha error).
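The "cumulated alpha error" mentioned in the minor comment can be illustrated with a short Python sketch. It assumes m independent tests that all use the same significance level, and it uses the Bonferroni correction as one common choice; the review does not say which correction is meant, and both function names are hypothetical.

```python
def familywise_error(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent tests."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return 1 - (1 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return alpha / m


if __name__ == "__main__":
    # Illustrative numbers only: 10 comparisons at a nominal level of 0.05.
    print(f"uncorrected familywise error: {familywise_error(0.05, 10):.3f}")
    print(f"Bonferroni per-test level:    {bonferroni_alpha(0.05, 10):.4f}")
```

With ten comparisons at a nominal level of 0.05, the chance of at least one false positive grows to about 0.40 under these assumptions, which is why a stricter per-test level is usually requested.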

The author completely ignores that the theory of free fall is no longer relevant - take into account that our industry is in the meantime able to construct things such as airplanes. Even more plausible comparisons are not made. For example, birds land at different speeds, which directly contradicts almost everything that can be found in the paper. Hence, the proposed theory is not only a purely artificial one; it is even possible to find direct contradictions to it in reality (landing birds). Hence, the general idea of falsifying the Aristotelian theory is not only irrelevant, it is wrong.

Minor comment: The author should have noticed that the proposed theory reveals results that are not measurable with the clock he used -- nor with any other clock that exists today (because all of them are not precise enough). As a consequence, it is clear that any arbitrary measurement inherently falsifies the proposed theory.


Thursday, 13 November 2014

10 Feet Steel Plate (Parable on Software Construction)

->This document is originally a pdf <-

This is a story about “situations in software construction” that the author frequently uses to argue about problems in software construction. The intention is to show by an analogy that the discipline of software construction has serious problems. Although analogies are often inappropriate to argue for or against something, the author has the personal feeling that for software construction the partially absurd and frustrating situation becomes even more obvious with the aid of analogies.
Stefan Hanenberg
University of Duisburg-Essen, Germany
stefan.hanenberg@uni-due.de
version 0.1, Essen, 10/11/2014

The 10 Feet Steel Plate 

1 The Story

Vince wants to build a house. After speaking with his bank he is convinced that he has the financial resources to build a house. However, Vince is not a house builder, so it is hardly astonishing (after thinking for a while about whether he should still try to build the house on his own **1) that he asks an architect for help. Vince has never built a house before. Hence, he has no experience of what can go wrong and how expensive all his different wishes for the house are. Vince feels slightly insecure, because he is not able to judge whether the advice given by the architect is meaningful (because of Vince's lack of civil engineering skills).

The meeting with the architect goes quite well. They speak about the number of rooms Vince requires and about the budget for the house. They speak about the number of bathrooms, the number of floors, the size of the living rooms and the size of the kitchen. They speak about the kind of stone that will mainly be used and the construction of the roof. Vince is absolutely aware that the more he wants, the more he finally has to pay, and because of that he keeps asking the architect how expensive every single decision will be.

After a while (and Vince already has some faith in the architect) the situation changes.

Architect: "Ok, it looks like we have clarified almost everything. I would just like to make one final proposal that you might take into account. I was thinking that we could install a 10 feet steel plate between the first and the second floor."

Vince is really surprised. He has never seen a house with such a steel plate before. However, since everything the architect said so far appeared meaningful to him, he is curious about the proposal.

Vince: "Tell me more about it. What exactly would be the benefit?"

Architect: "Well, in case you want to increase the number of floors in some years, this steel plate probably allows it."

Vince feels even more insecure. He never thought he could have the desire to increase the number of floors in some years. And he has actually never seen a house in the neighbourhood where additional floors were added after the initial construction. However, the architect might be right. Maybe in some years he might be interested in that and if he does not take care for this upfront, it might become more expensive later on.

Vince: "Ok, I am really not sure whether I need additional floors somewhere in the future. But just in case: How expensive is it to construct the house with the steel plate?"

Architect: "I do not know."

Silence. This was definitively not the answer Vince expected.

Vince: "Well, but could you guess, how much it might be?"

Architect: "Seriously? No. I have never seen a house with such a plate. Even if I might be able to determine how expensive the plate is, I have really no idea how much is required to install it."

Vince: "Ok. Maybe we can clarify this later. But tell me, how many floors will be possible later with this ten feet steel plate?"

Architect: "I really don't know. But it is plausible that at least some floors will be possible in case the foundation is able to carry the weight of the plate and the additional floors."

Vince: "But the foundation will be able to carry the steel plate plus one floor, correct? I mean, you proposed to install this plate between the first and the second floor."

Architect: "In fact, I do not know yet. It might be possible that a completely different foundation is required. And it is also possible that we have to make a number of changes to the first floor so that the steel plate does not simply crush it."

We do not know how the story went on, but we do know that Vince has been living for years in his own house, which was designed by a different architect. And although not a single house in town has a 10 feet steel plate installed, there are persistent rumours that there are such houses somewhere else.




What follows is my interpretation. Of course, there could be different ones, and I assume there are people out there who not only think that the story cannot be applied to software construction, but that the story is rather a good example why there are no or hardly any problems in software construction. I would love to read such interpretations and explicitly welcome any comments or additional interpretations.

2 Interpretation and Discussion

Would any of us take the architect's advice seriously? Would anyone take the idea seriously enough to even discuss it? Who of us would simply have thrown the architect out for even articulating such crazy ideas?

The interesting part of the story is that probably all of us agree that the architect's proposal is completely stupid. We would argue that just having a new idea --- the steel plate between the first and second floor --- does not imply that the idea is meaningful. Nobody would blame Vince for throwing the architect out.

2.1 From House Construction to Software Construction

As soon as we switch into the role of software engineers, our judgement of the situation changes. Let's replace some words in the previous story. Instead of building a house, we speak about building a piece of software. Instead of the 10 feet steel plate, we speak about the new technique on the market that appeared just recently and that should become one of the central parts of the new software.

Such a new technique has different kinds and forms. It might be the new programming language that just appeared, comparable to the situation in the 90s when Java appeared, comparable to the situation some years later when PHP appeared, or comparable to the more recent revival of the approximately 20 years old programming language JavaScript. In the 90s, it could have been the new document format such as XML or more recently the format JSON. It might be the new IDE, the new API, the new framework, the new code generator, the new markup language, the new middleware, the new architecture, etc. In the 70s it could have been a newly released relational database system or thirty years later one of the NoSQL database systems.

What all these techniques have in common is that they suddenly appear on the market, cause some interest among developers or managers*1, and are taken into account for real software production (comparable to the house that is about to be built). This is similar to the architect's spontaneous idea. However, it still feels like the story (and the story's transfer to software construction) is absurd.

2.2 Is the Analogy Absurd?

One direct reaction to the analogy is that the idea of the 10 feet steel plate is obviously totally crazy while the new software construction technique is not -- the analogy is absurd. Not a single house was ever built with such a plate. Hence, it is clear that such a house should not be built.

Well, actually no single software project was built with the new programming language either, which does not seem to imply for software engineers that they should not be the first to apply it. The argument “there is no single example for such a house” that seems plausible in house construction and that argues against the steel plate does not seem to work in software construction -- it looks like the analogy reveals something about the very different lines of reasoning in house construction and software construction.

We as software engineers have the tendency to say that the argumentation does not hold, because our knowledge of house construction is sufficient to determine that the plate does not make any sense, while our knowledge, experience and intuition in software construction rather advise us to take one of the new software construction techniques into account. This issue requires a much deeper discussion of the relationships between history, experience and knowledge. This will be done later in section 2.11.

Another direct reaction of software engineers is that they say they are not naïve and do not blindly apply new techniques just for the sake of applying something new. But this is what the architect is doing by proposing an untested technique for being used in a final product. Instead, software engineers try out new techniques very carefully before considering them appropriate for being used in the construction of productive software systems.

2.3 No Naïve Application of New Technologies?

Let's assume for a second that Vince is our close friend and it really looks like he takes the installation of the 10 feet steel plate into account. We would probably try to talk Vince out of even thinking about the plate. We would do that because we are not only afraid this house could financially ruin our friend; we are also afraid that the plate could do some serious harm to our friend and his family when the plate falls on their heads.

Maybe Vince is such an adventure seeker that he ignores our concerns. Probably we would ask him to build some kind of small model first. We do that because we want to convince him that the general idea is crazy. But even if the first model holds, we would probably ask him to build a small house with such a plate first -- maybe in the end we would advise him to build the house he desires with the plate, but not to move in until he has seen whether the steel plate actually does any harm to the house. We do that because we are not aware of any book titled “Why 10 feet steel plates cannot be used in house construction” and we are not aware of any building we could just show our friend. This implies that we cannot give a direct reference or resource that reflects the craziness of the 10 feet steel plate. Our advice does not mean that we take the steel plate seriously into account (again, based on the knowledge argument, see section 2.11). We do that because we want the test to fail: We want to prevent Vince from making a serious mistake.**3

When we now consider a typical argumentation in software construction, we see similar approaches but with completely different intentions. As said before, software engineers often argue that they do not blindly apply new techniques. Instead, they first try out new techniques in a smaller context. They do that because they want to find out whether the new technique could work. When they are convinced that the technique could work, they apply it maybe in another, maybe larger context. When it works there, too, they take the technique into account to be used for the development of productive systems.

Again, it is interesting to compare this argument with the (hypothetical) advice to our friend. We asked him to apply the technique because we want him to see it fail. In case the test does not fail, we ask him to apply another test. We use the idea of testing to show the invalidity of the new idea. Even if one of the tests in house construction does not fail, we do not think that the approach could work in actual house building. This is because we are aware that reality is much more complex and cannot simply be imitated by a simple model. And we know that one, two or three simple models, i.e. tryouts under non-controlled conditions, do not permit any serious and stable insights that could be directly applied to reality.

In software construction the idea of testing is used to show the validity of a new idea. Talking to software engineers reveals that their testing of new techniques is not a massive and critical analysis of a new technique. Instead, it is rather some test of plausibility whether the technique could work.**4 Hence, our advice that is directed against the installation of the 10 feet steel plate is quasi-inverted in the domain of software construction.

2.4 The Need for New Ideas

The reason for this inversion of argument is the difference between house construction and software construction with respect to how new techniques are considered. While in house construction new ideas are treated rather conservatively, the software industry is rather open to new ideas. In house construction, new (and potentially innovative) ideas are only applied if the resulting risk is rather low. This might be because the consequences of failing in house construction are terrible for the builder (who probably does not have the financial resources to build just another house) while the consequences in software construction are...well, we do not know, because we would first need to know how large the software project is with respect to the company's resources.

However, software companies also often emphasize the need for new and innovative ideas, otherwise no helpful techniques would be applied at all. And not using new techniques has two bad consequences. First, the company becomes old fashioned. If it becomes known on the market that the company applies only old and conservative techniques, the company becomes uninteresting for potential new clients. The other point is that such a company becomes uninteresting for new potential employees, too. And attracting potential employees is necessary in order to get new developers.

Both arguments might be right, but interestingly, this falls rather in the domain of marketing: software companies assume that applying non-new techniques (and ignoring in that way any bleeding-edge technologies) would lead to some kind of negative reputation that negatively impacts the company. This implies that the software market somehow desires bleeding-edge technologies -- which is an interesting argument, because it is at least not obvious why the software market should work differently from other markets. In other words, it looks like a statement such as “building houses for a predictable price” that works for house building does not work in software construction. This corresponds to an argument often heard from the software industry, which says that they ultimately apply new and innovative techniques in order to reduce costs.

2.5 Cost Reduction by New Technology?

When speaking about costs, there seems to be a direct mismatch between the 10 feet steel plate and the application of the new software construction technique. The 10 feet steel plate causes incredible costs: even without knowing any details about house construction it seems clear that the steel plate itself will cause most of the costs for the house. At the same time, it is rather unbelievable that a new software construction technique could cause such costs.

Having said this, it is worth thinking about a common argument heard from the modeling community, which states that before a piece of software is actually built, it should be modeled first. Maybe this argument is right (maybe not), but it obviously causes additional initial costs, because the development of the actual software starts much later due to the additional modeling step (which might turn out to be valuable later on).

While for the 10 feet steel plate some costs are obvious (at least for buying the plate; here, the steel price could be used for a first approximation), other costs are completely unknown: since such a plate was never installed in a house, it is unclear what additional steps need to be done. Things to be answered are how the plate would be laid on the first floor, what kind of machines might be necessary for that, how many workers might be necessary, etc. The (probable) changes to the first floor and the (probable) changes to the foundation also need to be considered.

For software construction techniques some costs are directly visible, because money needs to be transferred to someone else (who could be the producer of the new application server software). Other costs could be hidden (because getting the new technique under control might require some man months). Some of these costs might not appear directly but somewhere in the future (when the only person who was nearly able to handle the technique has left the company and it turns out that additional people require some training before doing any actual work) -- hidden costs that might also apply to the house with the 10 feet steel plate (in case it turns out later that the foundation finally gets damaged).

The interesting thing about newly applied software construction techniques is that there is typically relatively little knowledge about them, especially when we speak about the potential costs they cause. Of course, each of these techniques comes with a number of claims and promises, comparable to the additional new floors that can be installed on top of the plate. At first glance, the additional new floors seem to correspond to the idea of scalability -- each of us knows a story about some technique in software construction that promises that applying it makes the software more scalable. But in fact, these new floors stand for an arbitrary promise. Statements such as “makes the software more readable or understandable” match the “additional floors” just as well as “increases flexibility or reduces maintenance costs”.

In fact, we do not know to what extent certain choices of software construction techniques cause what costs.**5 It is unclear whether the application of a new programming language will in the end cause a dramatic increase in costs or whether the applied framework will in the end be responsible for the complete failure of our software project. Although we tend to think that we are sure that the 10 feet steel plate costs too much, we are merely less sure what the actual costs of the new software construction techniques are.

At this point, we find another objection: that we can be more sure that the new software technique will cost less than the steel plate, because there is actually a software market where the new software construction techniques are advertised, while there is no such market for the 10 feet steel plate.

2.6 Market as an Indicator for Applicable Technology?

One obvious difference between the 10 feet steel plate and the new software construction technique is that the steel plate appears to be the architect's spontaneous, crazy idea. This idea does not seem to have any background or foundation (at least the architect has not spoken about it). The 10 feet steel plate has not yet been produced. There is no market where the installation of 10 feet steel plates is advertised for use in house construction. There is no producer of 10 feet steel plates who creates such plates for the purpose of being used in house construction. The idea of the plate comes out of the blue.

The situation is different with new software construction techniques. As long as we do not speak about things such as process models, at least programming languages, APIs, etc. are things that have already been produced (before anyone considers applying them). People were involved, time was spent, and investments were made in order to create these artifacts. There are webpages where the artifacts can be downloaded or shops that sell them.

At least the existence of some market gives some trust that the provided products are more than just spontaneous, crazy ideas.

However, it is not clear whether this first impression remains the same after a closer look. Maybe the architect has a friend who owns a company that is actually able to produce such a plate. Maybe this friend was so convinced that the architect would successfully advertise the plate that he has already produced some. And finally, we need to take into account that the architect was quite honest in the meeting with Vince: instead of massively advertising the plate, he honestly told him that there is hardly any knowledge about it. From that perspective, the presence of a market is not an indicator for the validity of using 10 feet steel plates in house construction. It is rather a question of whether some company somewhere thinks that the plate could be sold -- although the plate might turn out to be not only useless but even harmful. Applying this thought to the software market, there is no reason why we should not think about it in the same way. That someone produced some new technique is just an indicator that someone thinks he can sell it. It is no indicator of the usefulness of the technique -- nor of whether the technique is usable at all.

2.7 Salesmen, Innovation and Responsibilities

The problem with speaking about the market and potential innovation is that the architect acts, from the perspective of a professional salesman, unprofessionally. Instead of praising the new innovative product, he is honest with the client -- no wonder that in the end he sells neither the steel plate nor any house. From the company's perspective (in case the architect comes from a company) the architect's behavior is harmful.

However, from the client's perspective the architect's behavior is appropriate. His initial consulting services were satisfactory, and he finally made a new proposal and gave an honest estimate of the usefulness of the new technique. Based on the architect's estimate the client was able to determine on his own whether or not he was willing to take the risk and apply the technique. The only problem left was that the architect's proposal was, from the client's perspective, so unbelievable that he completely lost his faith in the architect.

From a software developer's perspective the situation is considered differently. Because of the architect's honesty it will not be possible to actually try out whether the new technique could be applied, i.e. whether such a house could be built. And consequently, we will not soon be able to find out whether it is possible to increase the number of floors on top of it. The software developer rather considers the situation a missed opportunity -- and confuses something. First, serious house builders would not just play with a new technique and try it out on a new house (one that is actually inhabited by people). Even if one or two of the most trivial tests succeed, house builders would still not simply apply a new technique, because they are aware that some simple tests hardly say anything about the validity of the new technique. House builders are aware that many different factors determine whether a new technique is appropriate -- and that some simple tests cannot replace serious studies of its appropriateness. It looks like software engineers have a completely different perspective on these risks. This might have to do with the fact that house builders are aware that in the end someone has to pay for a failed test.

The second point is that the existence of something new does not necessarily imply an opportunity -- and does not imply innovation.

Doing something new without knowing the potential risks and without knowing how large the potential harm is, is irresponsible. The architect acts irresponsibly because he does not keep the potential harm away from our friend. Proposing a certain technique (while being aware that the technique cannot be applied, or at least not without a very high risk and very high costs) potentially causes harm. If the architect had successfully advertised the steel plate, it probably would have been Vince's ruin (and it could have been worse if the steel plate had in the end fallen on our friend's head). In the end, taking a risk is not free -- but people need to be aware of how large a risk is in order to decide whether or not they are willing to take it. Here, the unfortunate situation is that most of the risks in software construction are still unknown. Software developers have the tendency to say that as long as these risks are unknown, they can be ignored. Other disciplines (such as house construction) rather have the tendency to say that an unknown risk must not be taken.

2.8 But Finally, it Works!

Putting the previous concerns aside, software engineers often argue that finally they are able to deliver a running piece of software with the new software construction technique while the architect is not able to deliver a house. Hence, finally it works! The right analogy would have been that finally the house would have been built including the 10 feet steel plate.

Unfortunately, this often heard argument has two problems. First, it is untrue with high probability. And second, in case it is true, it has to do with something software engineers typically don't like to speak about: budget.

There is the often found statement that software projects have the tendency to fail quite often. Although this statement seems to belong to the general knowledge of software engineers, it is still unclear to what extent it is true. Mostly, people refer to the CHAOS report, which reports that between 15% and 40% of software projects fail. However, there are people who argue (for very good reasons) that the numbers of the CHAOS report should not be taken too seriously, because it is unclear where they come from. There are other sources available that report a cancellation rate between 10% and 15%**6. Let's be generous to the software developers and just assume that a 10% cancellation rate is correct. Again, let's go back to house construction. Would a serious house builder accept a 10% cancellation rate? No way!

But even if we accepted the wrong argument “Finally, it works”, we would still need to ask what the budget constraints were. Again, if there are no budget or time constraints at all, there is no risk at all, because a project can go on and on forever.

Again, it is possible to build the house with the 10 feet steel plate. The only problem is that in the end it gets expensive.

2.9 But Software is Unique!

Coming back to the cancellation rates of software projects, developers typically argue in another very specific way: software is unique and therefore cannot be compared to anything such as the construction of a simple house. The problem with the analogy, so the argument goes, is that we do not build a piece of standard software (the house) with one single new element (the 10 feet steel plate). Each piece of software is an individual act of creativity by software engineers, and the requirements for a new piece of software are so unique that they cannot be compared to anything else. The cancellation rates do not express that people have trouble making software because of the application of unknown techniques. The cancellation rates express that a lot of software is so unique that it is hard to predict whether the desired piece of software is doable at all.

This argument is interesting, because it says a lot about how software developers think about themselves. It reflects the perspective of a young and wild discipline where its members frequently address problems that appear unsolvable and that are finally solved as the result of hard work. And in case the problems are not solved, it is the result of unsolvable requirements.

However, when we finally take a look at the software market, we see a lot of investment in (and actually built) software that does not appear to be all that new. We see web applications where people register themselves and where personalized information is shown, games on mobile devices that do not seem to be much different from games we have seen before, or migrations of existing APIs to other programming languages. Yes, from time to time we see new and innovative products and techniques. But this is what happens in house building as well.

At least, it would be interesting to think about whether the high cancellation rates are an indicator that software construction spends too much time on techniques that later on (when the product has not yet been delivered) cause serious trouble. To apply the 10 feet steel plate analogy again -- it is valid to ask whether the high cancellation rates are an indicator of how often equivalents of 10 feet steel plates are tried out in software construction, rather than an argument for the uniqueness of software construction.

2.10 Software is Much More Complex Than Anything Else!

This cancellation argument is often used not only to argue for the uniqueness of software. It is also often used to argue that software construction is just much more complex than anything else. Even simple programs contain a large number of threads, communication processes, events, computations, etc. And a single error somewhere could already break the whole software. The mere fact that software developers can make a large number of errors is said to make their life much harder than the life of the architect, who just needs to compose four walls, a door, some windows and a roof. A side effect of this argument is the implication that house construction is a rather simple task.**7

Again, this argument reveals some naivety of software engineers with respect to other disciplines. For unknown reasons software engineers not only have the tendency to declare their own discipline rocket science, they also have the tendency to consider other disciplines trivial. As described before, a large amount of the software we find on the market is not so new and not completely different from other software that already existed. This does not directly imply that software is not complex, but it implies that at least a larger number of people were able to cope with software's complexity.

Both previous arguments (cancellation because of software's uniqueness, cancellation because of software's complexity) have the same tendency to invert a given argument -- something that already happened in section 2.3: In section 2.3 it was discussed that simple, successful tests in software construction are considered a proof of the validity of a new technique and a justification for applying such techniques. In other disciplines such simple tests are no more than small indicators for new techniques -- without any implication about their applicability. Here, we find that a high number of cancellations of software projects is supposed to be used as an indicator of the complexity and the uniqueness of software construction. Other disciplines would regard it as an indicator of immaturity.

However, it is time to come back to the very first argument: the statement that the discipline of civil engineering and software construction cannot be compared because of the differences in experience and knowledge in both disciplines.

2.11 What about Experience and Knowledge?

As already mentioned in section 2.2, there is another argument against this analogy that should be discussed here. While there is a very long history of house construction and a long history of knowledge from civil engineering, the same does not hold for software construction. Software construction is still a very young discipline. Our knowledge cannot be compared to that of civil engineering at all.

This reaction is not completely wrong: obviously, houses have been constructed for centuries while software has been constructed only for decades. As a consequence, we would directly think that the knowledge in both disciplines is different. But in fact, there are two different facets used in the same argument. The first one is related to age in terms of years (which probably somehow correlates with experience) and the second one is related to knowledge. Discussing both issues is not trivial and requires some space.

2.11.1 Age and Experience

With respect to age and experience, we need to agree that software construction is quite young -- let's say approximately 60 years. However, with respect to experience, we just have to take a look at our mobile devices, ask our bank's webpage about our savings, or start our car (and directly get feedback from some automated procedures about the car's condition, along with a Bluetooth connection to our mobile phone, which directly starts acting as a navigation system). A lot of software has already been constructed. While it is true that software construction is relatively young, it is at least not obvious that there is not already “a lot of” experience in software construction: it is not directly obvious whether this experience is “less” than the experience in house construction.

The problem comes directly from the word experience, which is often used in different meanings. Often, a phrase such as “experienced craftsman” means that someone has already done a lot of work in a certain domain. We assume that such a person is aware of lots of problems in his domain and has found (or was taught in) means to solve these problems. This means such a person is able to detect a problem, i.e. in a given situation he is able to see the similarity to something he went through in the past. Solving a problem means to apply some tricks that helped before and that quite often actually do help. In summary, we assume that such a person went through some learning curve over some time.

Having said this, we are aware that it is not necessarily the case that a person who spent many years in a certain domain actually has much experience. Experience implies “having gone through different situations and problems and solved a number of them”. It is possible that a person has spent many years in a given domain although he has not gathered much experience. On the other hand, it is also possible that a lot of experience is gathered within a relatively short time frame.

Here, again a number of people will complain that the analogy of the experienced craftsman does not hold, because the way craftsmen are trained is to a certain extent the transfer of experience -- and since this has happened over centuries, it cannot have happened in software construction. Well, again, this argument is maybe partially true. But continuing the discussion here would take us too far away from the original question. The only intention here is to say that “age” and “experience” are two different things. And it should be emphasized that “long time” does not imply “much experience” while “short time” does not imply “hardly any experience”. It should only be mentioned that it is at least not directly obvious whether experience in software construction is “less” than the experience in house construction.

2.11.2 Experience and Observation

Although the previous argumentation seems somehow plausible, there is one thing missing. As argued before, observations (i.e. the craftsman who sees a problem) are an essential part of experience -- and observations are ultimately the connecting part between experience and knowledge. However, the way observations can be made is quite different in the two disciplines.

When we see a house, most of us see the difference between a ruin and a newly built house (although we are wrong from time to time). When a house collapses (and a collapsing house is observable for everyone in the neighbourhood) we infer that there was something wrong with the house. When we see a number of similar houses collapse, we assume that this is related to the commonalities between those houses. Such observations (which might be wrong) and theories (the commonalities between the houses that cause the collapse, which can be wrong as well) can be made by everyone who just understands the concept of a house: such observations can be made by non-experts.

In addition to those observations by non-experts, there is an incredible number of observations possible by experts. Even if something completely new is tried out in house building, experts have been trained in making certain observations in order to check whether there is something seriously wrong. They would be able to detect much earlier than non-experts whether the house's foundation is in trouble, or whether the walls begin to have problems carrying the weight of the roof, etc. Just to be clear: so far, this text speaks about observations and not about applied knowledge. I do not mean here that the civil engineer applies his knowledge about statics in order to compute whether the house is doomed to collapse. He just observes certain phenomena (such as cracks in the wall) where his experience tells him that this might result in serious problems.

If we compare this to software construction, a simple observation such as “a house collapses” is not that easy. From time to time our webserver is not available. Or there are frequent problems with our WiFi. From time to time, we need to restart certain programs. But it is very often not possible to connect the observation directly with a certain piece of software or a certain characteristic of software. For a non-expert it is relatively hard to make observations that are directly related to the software. Again, it would be possible to argue against this. Windows users who know the blue screen do observe that something went seriously wrong, which seems comparable to “observing that a house collapsed”. However, in that case it is only the environment that crashed, which might or might not have something to do with the software. Or Linux users are able to observe that a program stopped with a segmentation fault -- an observation that is (somehow) directly related to the software itself.

The software examples above might or might not be applicable, but there is something wrong with the analogy. Even if we see a piece of software crashing from time to time (and if we do think for some reason that this is related to the software itself), it does not help us to judge what part of the software is responsible for that, because software is hidden behind a user interface. We do not directly see the different parts the software consists of. Hence, we cannot judge what parts of the software are problematic. Experienced users who frequently use a certain piece of software are often familiar with the problems it causes. They know that certain interactions with the software (long tables, certain input frequencies, certain functions) lead to trouble and therefore avoid those interactions. In fact these are observations. But these observations (again) do not allow us to identify those elements that actually cause the error. A spontaneous reaction here is to say that, for example, the usage of tables in a text processing application is the element in the program that is problematic. But we do not know whether the table implementation is a modular unit, and we cannot conclude what part of the software needs improvement in order to fix the problem. Compared to the house with the 10 feet steel plate (under the assumption that the house has actually been built), we might see that jumping around on the second floor increases the cracks in the foundation or in the walls of the first floor. We stop jumping around (comparable to no longer using tables in our text processor), but we cannot conclude from this what actually causes the cracks.

When we focus more on the general idea of the discussion (the relationship between experience and observations), the main argument above was that experience depends on observations. But as argued before, we have trouble making observations in software construction. While it is hardly possible to make such observations as a non-expert, it is hard even for experts. While some characteristics of a piece of software are observable (the size of the software in terms of hard disk space or memory consumption at runtime**8), it is unclear what can be concluded from these observations or what additional observations are possible or necessary. For example, even if we know that a piece of software requires 10 MB of disk space or 50 MB of memory at runtime, it is unclear what exactly this says about the software and its ingredients. While it is obvious that the civil engineer is able to detect a crack in the wall, it is unclear what comparable observations could be made for software.

One could argue that software is more a mystery than something else. However, the statement here is that we have not learned so far what could be observed. And as a consequence, our knowledge on software construction is quite limited.

2.11.3 Observations and Knowledge

In addition to experience there is the concept of knowledge -- again, both concepts are not directly related to each other. It is possible to extend knowledge by reading a book, while most of us would not assume that reading a book increases experience (well, at least not as long as we do not do some additional exercises that are possibly proposed in the book). At the same time, it is possible to increase experience without increasing knowledge. Sport might be a good example of that. An amazing football or soccer player might be able to do great things on the field; because of hard training the player does the right things at the right time. He does not need to think about how he currently moves in order to get some advantage on the field -- he just does the right things by intuition. This does not imply that he is aware of any of the things he is doing, nor does he need to be aware of any causal relationships between actions such as catching a ball and feinting a movement at the right time in order to get some benefit.

Yes, in house construction a lot of experience has been gathered over the centuries. But what is much more important is that over the centuries a lot of knowledge has been gathered. The origin of this knowledge might have been experience, based in the beginning on observations. But knowledge is more. Instead of doing the right things by intuition, knowledge gives us a conceptual framework to understand a situation and to judge whether a certain action would work in such a situation.

When we think about the 10 feet steel plate, we do not conclude only after the house has been built that the plate is responsible for serious harm. We think before building the house that it is a bad idea to install the 10 feet steel plate. Every software engineer has already had some experience with certain APIs or architectures that are badly implemented. In these cases they might refuse to apply the same technique in later projects. This is comparable to a civil engineer who has actually installed such a plate once and decided not to apply the technique again**9. But (again) the difference is that we know upfront that the steel plate is a bad idea.

Civil engineering has gathered a lot of knowledge over the last centuries. The different characteristics of the stones and woods used in house construction have been studied (in addition to millions of other things) by experts, and an amazing theory has been created as a result of these observations. All of these studies depend on the ability to make observations. When we want to study whether a certain stone is suitable for house construction, we need to study what weight the stone is able to carry, to what extent the stone can be damaged by water or weather, etc. And all of these studies depend on observing when a stone gets cracks**10. And civil engineers are aware that this knowledge is not the result of a single test. When a stone needs to be tested (and people are aware that two natural stones are never exactly identical), this is done by a larger number of tests, each with a large sample size. This implies that as long as the idea of cracks had not been found, none of these studies was possible. And without any of these studies, no general theory of house construction would have been possible: the maths that expresses these theories would not exist. And without such maths on house construction, there is a high probability that people would propose things such as the installation of 10 feet steel plates. And the result is that the cancellation rates in house construction would probably be comparable to today's cancellation rates in software construction.

Hence, we conclude that there is an urgent need to start making observations in software construction. As a first step, it is necessary to identify what needs to be observed -- similar to the cracks in the wall. The author's opinion is that human effort (in terms of working time) is a first step in this direction. Based on these observations it is possible to gather knowledge that might finally lead to some (tested) theories about software construction.

2.12 Wait! Software Construction does not Follow any Laws of Physics!

Wait! There is another often heard spontaneous reaction to the previous arguments. The typical reaction is: “House construction depends on the laws of physics. But software construction does not!”. It is probably true that no direct implications can be drawn from Newtonian physics for software construction. Well, no direct implications of Newtonian physics for medicine are known either, probably -- which does not prevent medicine from applying scientific methods to study the effect of drugs, etc.

The main problem is that the laws of software construction have not yet been found. It is unclear what the general laws are that underlie the understandability, readability, or maintainability of software. No general theory of software comprehension has been found yet. But this does not mean that such a theory does not exist. It just means that it has not been found yet.

There are people who insist on the non-existence of these theories. And from time to time these people are even found in academia. The astonishing thing about this argument is that it would mean that software construction is a purely probabilistic process that cannot be directly influenced by external interventions. If this is true, it implies that none of the new programming languages, IDEs, architectures, etc. would have any effect on software construction. This idea is not naive, it is crazy -- maybe even crazier than building a house with a 10 feet steel plate.

3 The Meaning of the Story

The meaning of the story is that in software construction we probably find ourselves, more often than we want, in situations where we seriously take 10 feet steel plates into account: crazy new ideas that in the end do harm to the original goal (the construction of something new). The goal of the story and its discussion is to illustrate that an obviously absurd situation stops being absurd as soon as it is considered as an analogy to software construction. It should emphasize the author's opinion that there are lines of reasoning that we directly reject in our daily lives for good reasons but that we do not reject as software engineers. This is an indicator that there are serious problems in software construction.

Our goal as software engineers should be in the future to identify those 10 feet steel plates upfront, i.e. before we actually build a certain software that crashes because of the newly applied technique. This should be done with the goal in mind that in the end we want to improve our discipline.

I have often heard the argument that this story tries to kill innovation.**11 The main argument is that, although a new technique might turn out to be dangerous or just useless, it is necessary to try it out. Otherwise it would not be possible to achieve any improvements or progress. Trying out new things is essential not only for the discipline of software construction but for the whole of society.

I totally agree with this. There is a need to make improvements and progress. And I see the need to try out new techniques. However, I do not only see the duty to try something new; I also see the duty to finally deliver a product and the responsibility to deliver the product with a reasonable effort. I do not think that playing with new techniques is a reasonable and responsible step in the right direction. New techniques should be seriously studied and tested before being considered for use in production software systems.

Another goal of this story is to invite people to think more about the extent to which currently applied techniques in software construction are completely unknown with respect to their risks. The name of this story, the 10 feet steel plate, might also be used as a metaphor in software construction for a technique that probably implies a very high risk and that comes with a lot of promises where it is unclear (and maybe even improbable) whether it is able to keep any of them.

The right step to prevent us from building software with 10 feet steel plates is to start making observations: not single ones, but a large number of observations in controlled environments. These observations will lead to knowledge. Without such knowledge, software construction will remain a mystical field full of wild ideas that in the end lead to houses constructed with 10 feet steel plates and an amazing number of houses that simply crash.

4 Acknowledgement

I would like to thank all the students who participated in the discussions about the 10 feet steel plate, especially those students who were reluctant to accept the story as a valid analogy and who argued about the absurdity of the analogy.

Footnotes

**1 It needs to be mentioned that Vince is a software developer, and software developers tend from time to time to build things on their own from scratch.

**2 It is not completely clear where this interest comes from, but at least some first approaches exist to find possible explanations (see the great survey by Meyerovich and Rabkin for such possible explanations [3]).

**3 The goal is not to repeat here the arguments of experimentation and falsification in software engineering. In case the reader is still interested in that, a different article discusses that in more detail (see [2]).

**4 The reader probably now tends to think that this discussion finally leads to the killing of all inventions. This is definitely not the goal. I come back to this objection in section 3.

**5 The essay by Andreas Stefik and me speaks about the problem of unknown costs and unreliable statements about possible benefits of programming languages in the programming language community (see [4]).

**6 In fact, this question has not been studied in detail so far, but at least there are studies such as the one by El Emam and Koru [1] that give first hints about how often software projects actually fail.

**7 The funny thing about this argument is that in the 90s the software community was massively interested in the works of Christopher Alexander, an architect who actually built houses. At that time, software developers rather spoke about the parallels between house construction and software construction, which in the end led to the works on software patterns.

**8 In fact, observing the memory consumption at runtime is far from trivial, but it would lead too far to discuss that here.

**9 This would mean that a simple test falsified the idea of the steel plate and hence he concluded the non-applicability of the technique.

**10 Again, this explanation is slightly too trivial, because a crack does not need to be something that can be seen with the naked eye; it may only be detectable with other instruments. Again, in order to keep the discussion simple, this is not discussed here in more detail.

**11 In fact, this is not only a reaction to this story but also a reaction to the demand to apply human-centered usability tests to software techniques in general.


Literature

[1] Khaled El Emam, A. Günes Koru. A replicated survey of IT software project failures. IEEE Software, 25 (5), pp. 84-90, 2008

[2] Stefan Hanenberg. Faith, hope, and love: An essay on software science's neglect of human factors. Onward 2014

[3] Leo Meyerovich and Ariel S. Rabkin. Empirical analysis of programming language adoption, OOPSLA'13, pp. 1-13, 2013

[4] Andreas Stefik and Stefan Hanenberg. The programming language wars: Questions and responsibilities for the programming language community, Onward 2014, 2014.