An insight on differences in programming approaches

Jul 15, 2023

I’ve observed that my approach to computer programming usually differs from what others might use, but only now have I understood how to explain it.

There are many ways to create software to solve a problem. It’s also assumed that the software solution is developed further over a period of time. It’s well known that a software solution does not grow evenly or predictably over its lifecycle, so it’s generally considered a good idea not to guess the future course of development at all. This is why programmers try to follow the YAGNI (“You aren’t gonna need it”) principle.

The first difference in my approach has been known to me for a long time now. I try to overcome short-sightedness by producing generic code where possible. Likely due to my innate tendency to zoom out and see the bigger picture, writing somewhat more generic solutions doesn’t usually involve extra effort. To illustrate, it’s usually as easy to implement X+Y as it is to implement 3+4, and from my experience, further requirements rarely stop there. Inevitably, 2+5 comes next, then 7+4, and so on.
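As a minimal sketch of the X+Y versus 3+4 point (the function names are made up for illustration), the generic version costs about as much to write as the hardcoded one:

```python
# Hardcoded: answers exactly one question.
def three_plus_four():
    return 3 + 4


# Generic: about the same effort, but it also covers 2+5, 7+4, and so on.
def add(x, y):
    return x + y


print(three_plus_four())  # 7
print(add(2, 5))          # 7
print(add(7, 4))          # 11
```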

However, I’ve recently discovered another bias I have in programming. I tend to prefer deeper embeddings, which I will try to explain below.

On embeddings

When programming, we use a programming language to encapsulate our knowledge and ideas into an executable form, which will help users solve problems in a specific domain or a mix of domains. Although we may not always understand it, the problem domain usually has some kind of conventional language and logic that subject matter experts operate within. In other words, we embed the language of the problem domain into the programming language.

In fact, in any project beyond a small scale, there are usually other embeddings as well. We may need to communicate with the database in its language, interface with APIs in their languages, utilize off-the-shelf libraries, and so on. We routinely embed one language into another when creating or extending software. This process can even be made explicit in larger projects, with the use of domain-specific languages (DSLs).

DSLs can vary widely, from full-blown embedded languages to languages written in the syntax of the host language, or even API languages of some third-party library, which implements parts of the problem domain [1].

Kinds of embeddings

It turns out [2] that embeddings can be shallow or deep, and that deep embeddings can use either first-order or higher-order abstract syntax.

Shallow embedding directly maps constructs in the embedded language to behavior or computations in the host language. It doesn’t allow manipulation of the structure itself, not even something as simple as composition. It’s just what programmers call “hardcoding”. There are a number of design patterns whose purpose is to make shallow embedding less “hardcoded”.
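To make this concrete, here is a rough sketch of a shallow embedding of a tiny arithmetic language. The language and the names `lit`, `add`, and `mul` are assumptions chosen for illustration, not something taken from [2]:

```python
# Shallow embedding: each construct of the tiny arithmetic language is a
# host-language function that computes its meaning immediately.
def lit(n):
    return n


def add(a, b):
    return a + b


def mul(a, b):
    return a * b


result = add(lit(3), mul(lit(4), lit(5)))
print(result)  # 23

# `result` is just the int 23. The structure of the expression is gone,
# so it cannot be pretty-printed, simplified, or stored for later.
```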

Deep embedding with first-order abstract syntax is a more advanced approach. It maps constructs in the embedded language to data structures in the host language. This allows manipulation of the structure, but managing variables and bindings can be non-trivial and, without proper discipline, can become ad hoc.
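A small sketch of a first-order deep embedding might look like the following. The node classes (`Lit`, `Var`, `Add`, `Let`) and the two interpreters are hypothetical and kept deliberately small. Note that variable names and the environment have to be managed by hand:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object


def evaluate(expr, env=None):
    """One interpretation of the data structure: compute a value."""
    env = env or {}
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name]  # bindings are looked up manually
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Let):
        inner = {**env, expr.name: evaluate(expr.bound, env)}
        return evaluate(expr.body, inner)
    raise TypeError(f"unknown expression: {expr!r}")


def show(expr):
    """Another interpretation of the same structure: pretty-print it."""
    if isinstance(expr, Lit):
        return str(expr.value)
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, Add):
        return f"({show(expr.left)} + {show(expr.right)})"
    if isinstance(expr, Let):
        return f"(let {expr.name} = {show(expr.bound)} in {show(expr.body)})"
    raise TypeError(f"unknown expression: {expr!r}")


program = Let("x", Lit(3), Add(Var("x"), Lit(4)))
print(evaluate(program))  # 7
print(show(program))      # (let x = 3 in (x + 4))
```

Because the program is plain data, the same value can be evaluated, printed, or rewritten. The price is the explicit environment handling in `evaluate`.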

Deep embedding with higher-order abstract syntax (HOAS) maps constructs in the embedded language to higher-order functions in the host language. It allows manipulation of the structure and handles variables and bindings more naturally, using the host language’s own mechanisms.
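As an illustrative sketch only (the `Lit`, `Add`, and `Let` classes are assumptions, and Python is not a typical host for HOAS), a binding construct can carry a host-language function instead of a variable name:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Let:
    bound: object
    # A host function stands in for the binder: no variable names, no environment.
    body: Callable[[object], object]


def evaluate(expr):
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Let):
        # Substitution is ordinary host-language function application.
        return evaluate(expr.body(Lit(evaluate(expr.bound))))
    raise TypeError(f"unknown expression: {expr!r}")


program = Let(Lit(3), lambda x: Add(x, Lit(4)))
print(evaluate(program))  # 7

nested = Let(Lit(2), lambda x: Let(Add(x, Lit(1)), lambda y: Add(x, y)))
print(evaluate(nested))  # 5
```

Scoping and name capture are delegated to Python’s own lambdas. The trade-off is that the body of a `Let` is now an opaque function, which is harder to print or serialize than a first-order tree.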

It’s a palette

It’s quite easy to draw an association with Greenspun’s tenth rule:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

The levels of embeddings described above can be seen as a palette. Problems and solutions vary, and there’s no one-size-fits-all solution. There are also other considerations, such as how to persist DSL structures. It’s both easier and safer to limit the DSL to a declarative specification, which will require interpretation with variable bindings, but can be stored and transmitted. (This is where Lisp’s homoiconicity comes in handy.)
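A minimal sketch of such a declarative specification, assuming a made-up S-expression-like format of nested lists with the operators `var`, `add`, and `mul`, could be stored as JSON and interpreted later against a set of bindings:

```python
import json


def evaluate(expr, bindings):
    """Interpret a nested-list specification against variable bindings."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    if op == "var":
        return bindings[args[0]]
    if op == "add":
        return sum(evaluate(a, bindings) for a in args)
    if op == "mul":
        result = 1
        for a in args:
            result *= evaluate(a, bindings)
        return result
    raise ValueError(f"unknown operator: {op!r}")


# The specification is plain data: x + 2 * y
spec = ["add", ["var", "x"], ["mul", 2, ["var", "y"]]]

wire = json.dumps(spec)      # can be stored in a database or sent over a network
restored = json.loads(wire)  # and turned back into the same structure

print(evaluate(restored, {"x": 3, "y": 4}))  # 11
```

Since the specification contains no host-language code, loading it from an untrusted source cannot execute anything beyond the operators the interpreter chooses to support.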

Higher-order embedding can be useful for well-understood or very abstract domains, which can’t be directly exposed to subject matter experts (unless they are also experts in mathematics, computer science, or logic). Higher-order embedding may also require software developers to truly understand what is going on. Judging by the modest popularity of languages like Lisp, Haskell, and various MLs, this approach isn’t for everyone. In exchange, it gives a very concise way to describe what is wanted from the computer.

In a sense, the palette of embeddings is one more instance of what is known as the Fundamental Theorem of Software Engineering [4]:

We can solve any problem by introducing an extra level of indirection.

Modularity in software is not just a nice-to-have feature, but a genuine tool to combat the complexity of software. Ideally, adding functionality to software should be like throwing more code into a ‘bag’. This is only achievable in some exotic cases, like semantic web graphs connected with rule engines where code is data, or in some highly ‘frameworkish’ component-based systems. More commonly, adding new code requires changes to existing code.

How is typing related to this?

The fact that adding new code usually requires changing existing code is what the expression problem [3] is about. The need for deeper embeddings is an objective one, due to their higher flexibility. Static compile-time typing inevitably leads to the extraction of the code that needs to be flexible into a DSL in one way or another. As a result, the host language’s type system becomes less useful, as typing also has to be done in the embedding. That’s correct: deeper embeddings may require their own type system if the host language’s isn’t sufficient. Furthermore, there may be a need for an even more capable type system in the DSL when the code in the DSL (either directly or indirectly, via UI) is supplied by users who aren’t professional software developers.
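To illustrate what “their own type system” can mean in practice, here is a rough sketch of a checker for a hypothetical tuple-based DSL with the operators `add`, `less`, and `if`. The host language’s typing plays no role here; the DSL is checked by the embedding itself, before anything is evaluated:

```python
class DslTypeError(Exception):
    """Raised when a DSL expression is ill-typed."""


def type_of(expr):
    """Return "int" or "bool" for a well-typed expression, else raise."""
    if isinstance(expr, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(expr, int):
        return "int"
    op, *args = expr
    if op in ("add", "less"):
        left, right = (type_of(a) for a in args)
        if (left, right) != ("int", "int"):
            raise DslTypeError(f"{op} expects int, int; got {left}, {right}")
        return "int" if op == "add" else "bool"
    if op == "if":
        cond, then, other = (type_of(a) for a in args)
        if cond != "bool":
            raise DslTypeError(f"if expects a bool condition; got {cond}")
        if then != other:
            raise DslTypeError(f"if branches differ: {then} vs {other}")
        return then
    raise DslTypeError(f"unknown operator: {op!r}")


good = ("if", ("less", 1, 2), ("add", 1, 2), 0)
bad = ("add", 1, True)

print(type_of(good))  # int
try:
    type_of(bad)
except DslTypeError as err:
    print(err)  # add expects int, int; got int, bool
```

A checker like this can report errors in the vocabulary of the DSL, which matters when the expressions come from users rather than from developers.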

Some programming languages possess robust capabilities for introspection and metaprogramming (adding another dimension for embeddings, from the type system). While metaprogramming can be extremely useful for implementing deeper embeddings, introspection is more akin to a band-aid solution than a truly direct approach.

Overall, the typing of the host language is not a decisive factor for the success of using embeddings.

Conclusion

I’ve found it incredibly beneficial to be able to choose the optimal level of embeddings for the programming tasks at hand, especially in long-term projects where architectural decisions significantly impact the required effort.

The only issue is that everyone on the team should be comfortable with this approach, as it can be very tempting to circumvent it with hardcoded kludges. At the very least, deeper embeddings should have clear methods to incorporate ‘shallow’ embeddings.
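One possible shape for such a method, sketched under the assumption of a small tree-based DSL with a hypothetical `Native` node, is an explicit escape hatch that wraps a host-language function while keeping it visible in the structure:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Native:
    """Escape hatch: a named host function applied to DSL arguments."""
    name: str
    fn: Callable[..., int]
    args: tuple


def evaluate(expr):
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Native):
        # The shallow part: delegate directly to host-language behavior.
        return expr.fn(*(evaluate(a) for a in expr.args))
    raise TypeError(f"unknown expression: {expr!r}")


def clamp(v):
    return max(0, min(v, 10))


program = Add(Lit(1), Native("clamp", clamp, (Lit(42),)))
print(evaluate(program))  # 11
```

The hardcoded piece still exists, but it has a name and a well-defined place in the tree, so it is easy to find and to replace with proper DSL constructs later.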

References

[1] Jugel, U. (2010). “Generating Smart Wrapper Libraries for Arbitrary APIs”. In: van den Brand, M., Gašević, D., Gray, J. (eds.), Software Language Engineering. SLE 2009. Lecture Notes in Computer Science, vol. 5969. Springer, Berlin, Heidelberg. DOI:10.1007/978-3-642-12107-4_24
[2] Jérôme Vouillon. Shallow embedding of a logic in Coq. Université Paris Diderot - Paris 7, CNRS. https://www.cis.upenn.edu/~sweirich/wmm/wmm08/vouillon.pdf
[3] Expression problem. https://en.wikipedia.org/wiki/Expression_problem
[4] Fundamental theorem of software engineering. https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering