Generating Source Code

Some people think that I am a big fan of code generators because of my work on JSXP. Others think I am totally opposed to them, since I often complain about them. This article tries to clarify my view on code generators.

Code Generation in JSXP

In JSXP, we rely heavily on code generation. Every time you change a view (typically an XHTML file), you have to generate a base class for the view controller. If you configure your IDE correctly, the base class is generated automatically when you save an X(HT)ML file in a certain directory, so this is (hopefully) not something that interrupts your work. Still, there is a break between working on the design and working on the code, and there is the danger of inconsistencies (if you forget to generate the controller).

So, why did we use code generation?

First, because it solves a technical problem that would otherwise be hard to solve. The problem most Java web frameworks have is that at some point, you have to use literal strings to connect your code to your design. This can mean referencing a “wicket:id” from Java code, or writing EL in JSF. Even worse, some frameworks force you to mix code and design (EL in JSF - EL is code, even if the creators of JSF sometimes deny that fact). JSXP solves this problem by creating type-safe, compile-time verifiable getters for all named elements from the design and setters for variables in the design.

Also, the code generator in JSXP connects two entirely different aspects of a web application: the user interface design and the backing Java code. This enables you to work either on the design or on the code, and the generator creates a bridge between the two. The generated code has the advantage that you have to test less: All elements you reference in the code are guaranteed to exist in the design, so you do not have to unit test this connection between code and design (you have to test this in Wicket or JSF - these frameworks detect such errors at runtime).
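The difference between a literal string id and a generated getter can be sketched as follows. This is only an illustration under assumptions: `Element`, `StringKeyedView`, `IndexViewGenerated` and `IndexView` are hypothetical names made up for this example, not actual JSXP, Wicket or JSF classes, and the real generated code will look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a named element in the design (not a JSXP type).
final class Element {
    private String text = "";

    void setText(String text) { this.text = text; }

    String getText() { return text; }
}

// Style 1: code and design are connected by literal strings.
// A typo in the id compiles fine and only fails at runtime.
final class StringKeyedView {
    private final Map<String, Element> elements = new HashMap<>();

    StringKeyedView() {
        elements.put("userName", new Element());
    }

    Element get(String id) {
        Element element = elements.get(id);
        if (element == null) {
            throw new IllegalArgumentException("No element with id: " + id);
        }
        return element;
    }
}

// Style 2: what a generator might emit for a design that contains
// one named element "userName" (assumed shape, for illustration only).
abstract class IndexViewGenerated {
    private final Element userName = new Element();

    public Element getUserName() { return userName; }
}

// Handwritten controller: a misspelled getter name is a compile error,
// so the connection to the design needs no unit test.
final class IndexView extends IndexViewGenerated {
    void greet(String name) {
        getUserName().setText("Hello, " + name);
    }
}
```

With the string-keyed view, `new StringKeyedView().get("usrName")` compiles and throws only when it runs. With the generated base class, `getUsrName()` would not compile at all, which is the guarantee described above.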

Code Generation in MDSD

Compare the above to code generation in Model Driven Software Development (MDSD). There, code is created from a model - typically some graphical model (UML) or a textual model (domain specific language, DSL).

Since the source code is also an abstract model of reality, this is a model-to-model transformation. In other words, one representation of some aspect of the application is transformed into another representation of the same aspect. It does not solve a technical problem, only an organisational one: making sure that different views on the same aspect are consistent.

This is a little bit like what the compiler does: The compiler transforms code from some high level programming language to machine code. So, nothing wrong with that, right? Then, why am I opposed to this kind of code generation?

The reason we use compilers is that nobody would want to write machine code. You could not create a large software project by developing it in machine code - it is very hard, close enough to impossible that nobody would even try. So, if we want to generate Blub source code (Blub being the hypothetical programming language from “Beating the Averages” by Paul Graham), this means that we do not want to write Blub code ourselves, maybe because Blub is not powerful enough. But if it is not powerful enough, why even use it as an intermediate language? Why not use a powerful language (whose compiler creates machine code) in the first place? Also, I think that Java is a good language for a lot of problems, so I don’t think that the reason for code generation in MDSD is related to the reason we use compilers.

Another reason that is often given - I already mentioned it earlier - is to keep the model and the code consistent. But, as I also already mentioned, the code is a model. And I think that the code should be the model. The question is: Why would we want a different representation of the same model?

Maybe because graphical representations are sometimes easier to read. But graphical representations do not scale well, so they are only useful for very small parts of a large system. Because of this, they are not really useful for designing large systems, only (maybe) for documenting parts of them. But for documentation, there is a better solution than generating the code from the graphical representation: Generating the graphical representation from the code! (doxygen does quite a good job of creating useful overview graphics.)
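To make the direction "diagram from code" concrete, here is a minimal sketch that uses Java reflection to emit a Graphviz DOT description of the inheritance relations of some classes. It is not how doxygen works internally; the class `ClassDiagram` and its method `toDot` are hypothetical names invented for this example.

```java
// Minimal sketch: derive a class diagram (in Graphviz DOT syntax) from
// compiled code via reflection. ClassDiagram is a made-up helper.
final class ClassDiagram {
    private ClassDiagram() {}

    static String toDot(Class<?>... types) {
        StringBuilder dot = new StringBuilder("digraph classes {\n");
        for (Class<?> type : types) {
            // Superclass edge (skipped for interfaces and direct Object subclasses).
            Class<?> parent = type.getSuperclass();
            if (parent != null && parent != Object.class) {
                appendEdge(dot, type, parent, "solid");
            }
            // One dashed edge per directly implemented interface.
            for (Class<?> iface : type.getInterfaces()) {
                appendEdge(dot, type, iface, "dashed");
            }
        }
        return dot.append("}\n").toString();
    }

    private static void appendEdge(StringBuilder dot, Class<?> from, Class<?> to, String style) {
        dot.append("  \"").append(from.getSimpleName())
           .append("\" -> \"").append(to.getSimpleName())
           .append("\" [style=").append(style).append("];\n");
    }
}
```

Calling `ClassDiagram.toDot(java.util.ArrayList.class)` returns text that the Graphviz `dot` tool can render. Because the diagram is derived from the code, it cannot drift out of sync with it.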

The other source for generating code in MDSD is the DSL: Wouldn’t it be nice if we could write software in the language of the domain? Well, no. It would not. First of all, a DSL is still a formal language, and your users/customers will not be able to write code in a DSL directly. So you’ll need programmers to write the code anyway. And you need programmers to create all the DSL tools. Second, the DSL is overhead, because there will be parts of the program you’ll have to write in Blub anyway. And you have to get the connection between the DSL and Blub right.

Even worse, you cannot debug or unit test graphical representations or DSLs. This means that when anything goes wrong, you’ll debug or unit test generated code. As if this were not enough fun, as soon as you find a bug, you cannot fix it in the code. You have to check the graphical/textual representation that was the input to the generator. If the problem lies there: Fix it, generate the code, maybe fix the unit tests because the newly generated artifacts don’t work with the old ones, and then test again. If the problem was not in the generator input: Fix the generator, then fix any problems in the model that were caused by the generator changes, fix the unit tests and test again. Did I mention that there are no refactoring tools for your graphical representation or DSL?

Conclusion

Generating code in MDSD is simply a waste of time and resources. Even worse, it will slow down development, with no real benefit for anybody (except consulting companies and management, which gets a false sense of control).

If you want to keep your code in sync with the requirements and the language of the domain, take a look at “Domain-Driven Design” by Eric Evans. Draw your UML on a whiteboard, and delete it as soon as the code does what the diagrams said.

Even in JSXP, I am not entirely happy with the code generator, because there is the danger that the generated sources do not match the XML designs. It would be much cooler if we could write something like this:

public class IndexXhtmlView extends "index.xhtml"

and have a compiler that gets that right. We did not even think about this possibility, since it would be very hard to make that work in Java.

David Tanzer

Coach | Consultant | Trainer
