Why does one standalone anti-spyware program (just as an example) take up 10MB on my machine but a similar anti-spyware program
takes up 150MB? They both have a GUI; have real time scans; scan for spyware, adware, trojans, keyloggers, rootkits; automatic
signature definition updates; allow the user to configure what gets scanned, when it gets scanned, etc. etc. And they’re both
Other than one program having a fancier dashboard, they seem to be almost identical in functionality and they’ve been rated just
about equal by various reviewers. Does the larger program hog more CPU resources simply because it is larger?
I’m sure it seems strange to a non-technical person, but what you describe is very, very common. Programs that do similar
things often differ dramatically in both size and speed.
The answer’s actually fairly complex, since many things factor in: everything from the choices made by
the software’s designer to the age of the product.
The most obvious difference that comes to mind is the choice of programming language in which the software is written. Different languages are more or less efficient at converting human-readable instructions into the sequence of actual machine instructions that the CPU executes. Sometimes those differences can be quite large.
Now, one might ask, “Well, then why don’t they all choose the one that generates the smallest, fastest program?” It’s not that simple. The trade-off is typically development time. Different programming languages require different amounts of work to write the same program. At one extreme, one could write a program in assembly language, and it could be extremely small and fast, but it would probably take at least twice as long to write as the same program in a higher-level programming language like C++ or Java. In today’s business climate, when you can deliver is often as important as what you can deliver, so the amount of time it takes just to write the software is a serious consideration.
Choice of programming language goes well beyond simple development speed. Things like staff training, appropriateness to the task, and the availability of development and diagnostic tools for that language can all play a part. Even personal taste, since many consider programming as much an art as a science, can play an important role in this fundamental choice.
Which leads to another difference: software design. For any problem that can be solved with software, there are as many ways to write that solution as there are programmers who would write it. Many solutions are simple, fast and elegant – as I said, almost an art. But there are two important things to note: not all programmers are alike, and not all solutions are the obvious choice.
There’s a rule of thumb in software development that the best engineers are generally 10 times better than the average, and the worst are 10 times worse than the average. That’s quite a wide spectrum of ability within the programming community. Now “best” and “worst” are fuzzy terms, but a good way to illustrate the difference might be to put it this way: what a great programmer might be able to write in a single “line” of programming language, a less talented individual might solve with significantly more code. And of course more code means a bigger program.
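To make that concrete, here is a rough Python sketch of the same small job written two ways. The task and the function names are made up purely for illustration; the only point is that both versions give the same answer while one takes far more code.

```python
# Hypothetical task: add up the squares of the even numbers in a list.

def sum_even_squares_long(numbers):
    # Step-by-step version: more lines, same result.
    evens = []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)
    squares = []
    for n in evens:
        squares.append(n * n)
    total = 0
    for s in squares:
        total = total + s
    return total


def sum_even_squares_short(numbers):
    # One-line version of the same logic.
    return sum(n * n for n in numbers if n % 2 == 0)


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    # Both calls produce the same total.
    print(sum_even_squares_long(data), sum_even_squares_short(data))
```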
It’s also not always the case that a single line of code is the right choice, depending on the overall design goals of the software. Two different solutions, perhaps very different in size, may solve the same problem but do so in ways that expose other significant differences. For example, the smaller solution might be exceptionally difficult to understand and very fragile, while the larger could be very simple to understand and difficult to break. A design decision might be made to make a trade-off of stability over size. The result might be that the larger solution would be the one implemented.
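Here is a small Python sketch of that trade-off, under the assumption of a made-up settings format like "name=value;name=value". Both function names are hypothetical. The short version works on clean input but fails on anything unexpected; the longer one does the same job and shrugs off messy input.

```python
def parse_settings_terse(text):
    # Compact, but raises ValueError on a blank entry or a missing "=".
    return dict(pair.split("=") for pair in text.split(";"))


def parse_settings_sturdy(text):
    # Longer, but tolerates blank entries, stray spaces, and missing "=".
    settings = {}
    for pair in text.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty entries such as a trailing ";"
        key, sep, value = pair.partition("=")
        if not sep:
            continue  # skip entries with no "=" instead of crashing
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    print(parse_settings_terse("scan=daily;mode=full"))
    print(parse_settings_sturdy("scan = daily; mode=full;;junk;"))
```

A team that values stability might reasonably ship the second version, even though it is several times the size of the first.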
Another difference that’s often overlooked is something called runtime libraries or runtime support.
Programmers rarely write everything from scratch. They rely on existing libraries of software. As a simple example, a programmer should never have to write software to concatenate (join) two strings of text (say “ABC” and “XYZ”) together into a single string (“ABCXYZ”). There are library routines that a programmer can use to do exactly that without having to write (and test and debug) that code themselves.
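In Python, for instance, joining strings is built into the language and its standard string type, so the programmer writes almost nothing. The wrapper function below is a hypothetical name used only so the example is easy to try.

```python
def join_with_library(first, second):
    # str.join is part of Python's built-in string type;
    # nothing here had to be written, tested, or debugged by us.
    return "".join([first, second])


if __name__ == "__main__":
    print(join_with_library("ABC", "XYZ"))  # prints ABCXYZ
    print("ABC" + "XYZ")                    # the + operator does the same job
```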
In an ideal world, such a library of existing software would be very granular, picking up only those routines actually needed by the program being written. Alas, this is not an ideal world, and one of the differences between programming languages and systems used is, in fact, the size of the runtime library that they drag along. In one language, using a string concatenation function might cause exactly and only that function to be included. In another language, it might cause that, and several other string functions to also be included in your program, whether or not you actually use them.
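A loose analogy can be seen in Python, with the caveat that this shows a module being loaded into memory rather than a runtime library being built into a compiled program. The sketch below uses just one function from the standard `textwrap` module, yet everything else the module defines is loaded alongside it.

```python
import textwrap  # we only intend to use textwrap.shorten

short = textwrap.shorten("Programs that do similar things differ in size", width=20)

# Everything else the module defines was loaded too, used or not.
public_names = [name for name in dir(textwrap) if not name.startswith("_")]
unused = [name for name in public_names if name != "shorten"]

if __name__ == "__main__":
    print(short)
    print(len(unused), "other names came along, for example:", unused[:3])
```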
At a slightly higher level this is actually why things like the .NET framework exist. The .NET framework includes many higher level functions that can be used to make accessing Windows and other services easier for programmers. By programming to it, applications can assume that everything in the .NET framework is already on the machine, which means that they don’t have to write, or include, any of that functionality in their software. The result is that their programs appear smaller. (.NET is just an example; there are actually many approaches, both similar and different, from the Visual Basic runtime to the Java virtual machine.)
A final aspect I want to address is software age. Left to itself, software only gets larger. By that I mean that as software progresses from version to version and reacts to both operating system changes and changing user requirements, it only grows in size.
The term “ball of mud” has been used to describe software growth. Each new feature, each new request, each new demand on the software just causes a little bit more mud to get slapped on the ball until the ball becomes enormous (and occasionally collapses under its own weight).
There are a couple of ways this can happen. New features, of course, require additional code. That much is fairly obvious; more work means more instructions to the computer on how to perform that work. Changed features, sometimes even the removal of features, can also cause the program size to expand in unexpected ways. As one simple example, it’s often more expedient (and safer) to disable a feature rather than remove it. The code to implement the feature remains, but is never accessed because code is added to prevent that access from occurring. Particularly in already large and complex programs, this and similar side effects of code changes often cause growth.
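A minimal Python sketch of that “disable rather than remove” pattern might look like the following. The flag and function names are invented for the example. Notice that switching the feature off made the program slightly bigger, not smaller.

```python
# Hypothetical flag added in a later version to switch the feature off.
LEGACY_REPORT_ENABLED = False


def build_legacy_report(items):
    # Old feature: still shipped with the program, but no longer reachable.
    return "LEGACY REPORT: " + ", ".join(items)


def build_summary(items):
    # The current behavior.
    return f"{len(items)} item(s) scanned"


def run_scan(items):
    # This check is new code; the old code above was never deleted.
    if LEGACY_REPORT_ENABLED:
        return build_legacy_report(items)
    return build_summary(items)


if __name__ == "__main__":
    print(run_scan(["file1", "file2"]))
```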
In fact, often the only way to make an elderly program smaller is to chuck it and rewrite it from scratch. This is typically a very expensive proposition, but depending on the age, stability and ease of maintenance of the older software it can be a very legitimate and lucrative choice. When starting over, every design decision and choice, from the programming language on up, is open for review.
Finally, size is not an indicator of efficiency or expected CPU usage. Size and speed are, essentially, two independent things. While software designers strive for small and fast, in fact the complexities of software design often include many tradeoffs that may, or may not, relate the two. Small programs can be fast or slow. Large programs can be fast or slow.
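As one illustration that size and speed are separate questions, here is a Python sketch with two made-up functions that compute the same Fibonacci number. Each one counts its own steps in a list passed in by the caller, which is just a device for this example. The smaller function does far more work than the larger one.

```python
def fib_small(n, counter):
    # Tiny, but repeats the same work over and over.
    counter[0] += 1
    return n if n < 2 else fib_small(n - 1, counter) + fib_small(n - 2, counter)


def fib_large(n, counter):
    # More lines, but each value is computed only once.
    if n < 0:
        raise ValueError("n must be non-negative")
    previous, current = 0, 1
    for _ in range(n):
        counter[0] += 1
        previous, current = current, previous + current
    return previous


if __name__ == "__main__":
    small_steps, large_steps = [0], [0]
    print(fib_small(20, small_steps), "in", small_steps[0], "steps")
    print(fib_large(20, large_steps), "in", large_steps[0], "steps")
```

The reverse can just as easily be true: a large program can be slow, and a small one can be fast.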