Software is not what most of us think
A great deal of software has already been created in this world. Many people use computers or other computing devices, and some subset of them make software. But do we understand what software is?
There is a valuable approach to thinking about software that helps clarify its fundamentals, aiding in the creation or selection of quality solutions. This principle can guide decision-makers — whether in the general public or at the CTO level — in choosing whether to buy, develop in-house, or use popular low/no-code tools. For developers and designers, it also offers insight into when to use copy-pasted or AI-generated code.
Thinking of software as a tool that helps achieve specific goals is accurate, and most users can easily relate to this perspective. It answers the question ‘Why?’ but doesn’t necessarily help when it comes to choosing or creating software.
From a developer’s perspective, it’s also important to consider patterns, development tools, libraries, frameworks, infrastructure, and code generation tools. This answers the question ‘How?’. In this area, there is much debate about which programming language, integrated development environment, framework, or code-generation tool to use.
Naturally, the question ‘What?’ arises. What is software? We argue that understanding this is key to becoming a good software developer, designer, or competent decision-maker.
Software is knowledge
The answer is simple, and the follow-up questions arise naturally. Knowledge of what? It’s the knowledge of our problem (or business) domain, complemented by an understanding of the solution, broadly defined. And yes, the ‘problem domain’ is not a single entity but a nested, often multi-layered whole, extending all the way down to the computing hardware as a chain of ‘problem-solution’ links.
It’s also clear that an elegant software stack avoids introducing unnecessary complexity — meaning it doesn’t create problems that require additional solutions — though sometimes this is unavoidable when working with well-established, existing software.
When recruiting programmers, attention is paid to the tools they know well and to whether they are knowledgeable about the industry they will be developing software for. The ability to learn and acquire new knowledge is also a plus. This is because software developers work with the upper part of the problem-solution chains mentioned above. They are knowledge workers. It also follows that programmers should be able to move easily along the chain of abstraction, all the way from the higher-level business context to the deeper parts of the stack.
In this framework, we can understand the limitations of LLM-based zero-code approaches. Does the LLM possess the knowledge of the domain and the typical solutions for what you want to achieve? The answer is often ‘yes,’ but the quality of that knowledge may vary. For example, it’s within the reach of generative AI to create a database schema, tailor queries, and build a user interface on top of it. However, real-world software often has many more requirements. What works for a few thousand records may not scale to handle hundreds of millions, and so on. And because an LLM contains ‘average knowledge,’ human guidance and insight are needed to steer it toward truly better, innovative solutions. A layperson is capable of coming up with innovative ideas at the top of the abstraction layers, but often deeper understanding and adjustments to the underlying parts are necessary to achieve the desired results.
There are many ways software code can be created and structured. Code (and the accompanying data) is the representation of knowledge in a computing system. It can be very rigid and not adaptable, or it can be highly flexible. It may be visible to the programmer, as in traditional programming approaches, or hidden, as in some no-code solutions. We can identify the source code — the code that forms the foundation of the software being built. In some cases, there may be no clear source code or a non-textual, often proprietary, representation of knowledge locked to the maker of the tool. Many interesting solutions end up gathering dust because the knowledge was hidden within proprietary systems from vendors who went out of business, leaving others who relied on them stranded.
Thus, the fluidity of knowledge in the software code is crucial. When knowledge is flexible, accessible, and transferable, it ensures that solutions can evolve and adapt over time. However, when it’s locked within proprietary systems, the loss or inaccessibility of that knowledge can render even the most innovative solutions obsolete.
Clean source code represents this kind of fluid, accessible knowledge. It avoids mixing different links of the problem-solution chains, ensuring clarity and adaptability. When knowledge is structured this way, it remains flexible and transferable, reducing the risk of obsolescence and supporting long-term innovation.
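To make this concrete, here is a minimal sketch of what ‘not mixing the links’ can look like. The names and the discount rule are invented for illustration: the domain rule lives on its own, and persisting an order is a separate, replaceable link in the chain.

```python
from dataclasses import dataclass

# Domain link of the problem-solution chain: pure business knowledge,
# with no storage or transport details mixed in.
@dataclass(frozen=True)
class Order:
    total: float
    is_repeat_customer: bool

def discount(order: Order) -> float:
    """A hypothetical domain rule: repeat customers get 10% off orders over 100."""
    if order.is_repeat_customer and order.total > 100:
        return order.total * 0.10
    return 0.0

# Infrastructure link: a separate concern. Swapping this out
# (file, database, API) never touches the rule above.
def save_order(order: Order, path: str) -> None:
    with open(path, "a") as f:
        f.write(f"{order.total},{order.is_repeat_customer}\n")
```

Because each link is isolated, the domain knowledge in the discount rule stays readable and transferable even when the storage mechanism is replaced.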
In some cases, not everything is known beforehand, and knowledge acquisition becomes part of the solution: for example, by providing building blocks that match the user’s mental model, by running simulations, by applying machine learning, or by using a solid logic framework to fill in gaps through inference. At this point, the knowledge shifts to a meta-level: we need knowledge about how to obtain knowledge, which ultimately reduces the entire process to the same principle we are discussing here.
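As a toy illustration of the inference case, here is a minimal sketch. The facts and the single hand-written rule are invented for the example; the point is that a logic framework can derive knowledge that was never stated explicitly.

```python
# Explicitly stated knowledge: two isolated facts.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def infer_grandparents(facts):
    """One inference rule: if x is a parent of y, and y of z, then x is a grandparent of z."""
    derived = set(facts)
    for (rel1, a, b) in facts:
        for (rel2, c, d) in facts:
            if rel1 == rel2 == "parent" and b == c:
                derived.add(("grandparent", a, d))
    return derived

for fact in sorted(infer_grandparents(facts)):
    print(fact)
# The gap ('grandparent', 'alice', 'carol') is filled by inference,
# not stated beforehand.
```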
The logic of knowledge, its organizational principle, is crucial. Knowledge isn’t just a collection of isolated facts; it typically includes a logical framework that connects them. This framework can be exact or probabilistic, but it always imposes some form of constraint. Knowledge alone, however, is only part of the equation. Equally important is the ability to ‘think’, in the sense of navigating the layers and chains of abstraction discussed earlier. For example, a source code interpreter is a piece of software that ‘thinks’ about another piece of code, activating the knowledge embedded within it. Facts alone can’t achieve this; a logical framework is required to process and apply knowledge effectively.
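A toy interpreter makes this tangible. In the sketch below, deliberately minimal and hypothetical, the nested tuples are inert knowledge, and eval_expr is the logical framework that ‘thinks’ about them and activates them.

```python
def eval_expr(expr):
    """Walk an expression tree and activate the knowledge it encodes."""
    if isinstance(expr, (int, float)):
        return expr                          # a bare fact
    op, left, right = expr                   # structured knowledge
    a, b = eval_expr(left), eval_expr(right)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")

# (2 + 3) * 4, represented as data until the interpreter processes it.
print(eval_expr(("*", ("+", 2, 3), 4)))  # -> 20
```

Without the interpreter, the tuple is just a collection of facts; with it, the embedded knowledge does useful work.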
The corollary is that, for example, generative AI cannot arrive at a solution if it lacks the necessary knowledge or, worse, has incorrect information or data. The principle of ‘garbage in, garbage out’ applies here as well. An LLM can contain a vast amount of knowledge in an easily accessible way, through intricate associations, but when it doesn’t, gaps are filled with guesses.
What is needed to create a software solution?
As you can already guess, one needs knowledge about the problem domain (the top-level one) to start thinking about a solution. This top-level knowledge can be obtained from different sources and with varying levels of detail and accuracy. There is always a core set of knowledge that needs to be understood. For example, if we want to build a system that solves a problem similar to a competitor’s, we might look at what the competitor is doing. However, it’s important to understand that another system may not be an ideal representation of domain knowledge for various reasons, such as organic growth, poor discovery, mediocre implementation, technical limitations due to tooling choices, or mimicking traditional “paper-based” business processes. This suboptimality can reach the point where the user interface and even the available actions in the prior system should be taken with a grain of salt.
Growing a core knowledge model is a much better and more compact way to represent the conceptual knowledge of the system. It’s easy to think in these terms and to create user-oriented representations: there can be many outward representations, but the underlying model remains one. This also makes it easy to identify possible shortcuts. The core knowledge is usually so compact that it doesn’t require much cognitive effort to process. And if certain facts aren’t reducible to more compact rules, they can be carried as a list of constraints all the way into the solution.
The core knowledge model can directly inform the data model and even the software design itself (see, for example, Domain-Driven Design (DDD) or semantic design).
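As a minimal, hypothetical sketch of this idea: a single core model carries a domain constraint, while several outward representations are derived from it, so the underlying model remains one.

```python
from dataclasses import dataclass

# The core knowledge model: one entity, one domain constraint.
@dataclass(frozen=True)
class Appointment:
    patient: str
    start_hour: int  # 24-hour clock, simplified for the sketch

    def __post_init__(self):
        # A fact not reducible to a more compact rule, carried as a constraint.
        if not 8 <= self.start_hour < 18:
            raise ValueError("appointments must fall within office hours")

# Outward representations derived from the one underlying model.
def as_calendar_row(a: Appointment) -> str:
    return f"{a.start_hour:02d}:00  {a.patient}"

def as_api_payload(a: Appointment) -> dict:
    return {"patient": a.patient, "startHour": a.start_hour}

booking = Appointment("J. Doe", 9)
print(as_calendar_row(booking))  # calendar view
print(as_api_payload(booking))   # API view
```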
How does this help?
The concepts described above are quite abstract, and not every reader may have made it this far. Software development — whether through code or no-code — requires the ability to acquire and process knowledge. Hopefully, readers interested in this topic can see the concrete applications of the abstract principles explained in this article. Traditionally, programming has been viewed primarily as a manual formalization process, in which knowledge is structured so that machines can process it. Since the advent of the first computers, significant progress has been made to ease this process. For example, we now have generative AI that can assist in transforming and formalizing knowledge to some extent.
The knowledge principle, which we’ve discussed — where software revolves around knowledge — is quite simple to apply. The key is to identify what each agent in the system, whether it’s a machine, AI, or human, knows and assess whether that knowledge is enough to solve the problem. For instance, while a large language model (LLM) may have access to vast amounts of information and can retrieve it on demand, there’s no guarantee that it possesses all the necessary inputs or even the right knowledge for a specific situation. There is no magic involved. You can prompt an LLM to produce something, but if it lacks the relevant knowledge, the outcome will likely fall short of what the user needs — except perhaps in creative cases like writing fairy tales, where gaps in real knowledge might not matter as much.
A naive approach can work in some common and well-known cases, but ‘naive’ usually means ‘not knowing what one doesn’t know.’ This can be dangerous in many areas of software development, leading to failures or continuous delays.
The knowledge principle also applies when code needs a rewrite. Code may need to be rewritten, but unless the problem it solves has changed dramatically, the knowledge embedded in that code had better be preserved. Otherwise, it will have to be painfully gathered all over again.
On all levels
In software development, knowledge is essential at both higher and lower levels. At the high level, understanding the business problem and domain is critical. Developers and decision-makers must engage with stakeholders to capture the core needs of the system, translating them into software requirements. This knowledge informs major design decisions, such as system architecture and feature prioritization. Simply mimicking competitors can lead to suboptimal solutions — it’s essential to develop a domain-specific understanding that drives unique, effective software.
At the mid-level, domain knowledge drives design and architecture. Approaches like Domain-Driven Design (DDD) or semantic design emphasize that technical models should reflect real-world business logic. This ensures that the software remains both scalable and flexible without adding unnecessary complexity. Here, the management of knowledge is more important than merely following programming conventions or relying heavily on advanced tools. The tools, while useful, must serve the deeper goal of managing and structuring knowledge effectively.
At the low level, technical proficiency with code, languages, and frameworks is crucial, but it is still secondary to the knowledge that informs how these tools are used. Clean, well-structured code is a vehicle for making knowledge fluid and accessible to future developers. This fluidity ensures that the system can evolve over time. Even though technology changes rapidly, the core principle remains: software development is fundamentally about managing knowledge, not just about using the latest technologies or adhering to rituals and practices.
Ultimately, software that succeeds in the long term is built with the intelligent management of knowledge — whether human or automated — at every level of abstraction. Low-code or zero-code solutions can offer prototyping efficiency but often lack the depth of domain knowledge required to solve complex, evolving problems. By prioritizing knowledge over tools and traditions, software remains adaptable and effective as both the domain and the technology landscape change.
Conclusion
In brief, when developing software or any data processing solution, ensure that at each level, knowledge is not assumed to appear magically from nowhere. While information, which forms the basis of knowledge, can change form or seem to ‘expand,’ it cannot be created out of nothing. Making a habit of applying this ‘knowledge preservation’ principle can lead to more robust, cost-effective, and thoughtful solutions. A core knowledge model can serve as the seed from which to expand. Given the ‘seed,’ the rest of the design grows beautifully.