Manual memory management
In computer science, manual memory management refers to the usage of manual instructions by the programmer to identify and deallocate unused objects, or garbage. Up until the mid-1990s, the majority of programming languages used in industry supported manual memory management, though garbage collection has existed since 1959, when it was introduced with Lisp. Today, however, languages with garbage collection such as Java are increasingly popular and the languages Objective-C and Swift provide similar functionality through Automatic Reference Counting. The main manually managed languages still in widespread use today are C and C++ – see C dynamic memory allocation.
Description
Many programming languages use manual techniques to determine when to allocate a new object from the free store. C uses the malloc function; C++ and Java use the new operator; and many other languages allocate all objects from the free store. Determining when an object ought to be created is generally trivial and unproblematic, though techniques such as object pools mean an object may be created before it is actually needed. The real challenge is object destruction: determining when an object is no longer needed, and arranging for its underlying storage to be returned to the free store for reuse. In manual memory management, this is also specified by the programmer, via functions such as free in C or the delete operator in C++. This contrasts with the automatic destruction of objects held in automatic variables, notably local variables of functions, which are destroyed at the end of their scope in C and C++.
Manual management and correctness
Manual memory management is known to permit several major classes of bugs when used incorrectly, notably violations of memory safety and memory leaks. These are a significant source of security bugs.
- When an unused object is never released back to the free store, this is known as a memory leak. In some cases, memory leaks may be tolerable, such as a program which "leaks" a bounded amount of memory over its lifetime, or a short-running program which relies on the operating system to deallocate its resources when it terminates. However, in many cases memory leaks occur in long-running programs, and in such cases an unbounded amount of memory is leaked. When this occurs, the size of the available free store continues to decrease over time; when it is finally exhausted, the program crashes.
- Catastrophic failure of the dynamic memory management system may result when an object's backing memory is released more than once (a double free); when an object is explicitly destroyed more than once; when a programmer attempts to release the backing memory of an object that was not allocated on the free store; or when a programmer, manipulating an object through a pointer into an arbitrary area of memory managed by an unknown external task, thread, or process, corrupts that object's state, possibly by writing outside its bounds and corrupting its memory management data. The results of such actions can include heap corruption, premature destruction of a different object which happens to occupy the same location in memory as the multiply deleted object, program crashes due to a segmentation fault, and other forms of undefined behavior.
- Pointers to deleted objects become dangling pointers; attempting to use such a pointer after deletion can result in difficult-to-diagnose bugs.
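The hazards listed above can be made observable without triggering undefined behavior. The following is a minimal C++ sketch, not a definitive technique: the names live_count, Tracked, make_tracked and destroy are hypothetical and introduced only for illustration. A counter records how many objects are alive, so a forgotten delete shows up as a non-zero count, and the destroy helper nulls the caller's pointer so that a repeated release through that same pointer becomes a harmless no-op.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: a global count of Tracked objects currently alive.
inline std::size_t& live_count() {
    static std::size_t count = 0;
    return count;
}

// Hypothetical instrumented type; it exists only to make leaks observable.
struct Tracked {
    int value;

    explicit Tracked(int v) : value(v) { ++live_count(); }

    ~Tracked() {
        assert(live_count() > 0);  // debug check: never destroy more than we created
        --live_count();
    }

    // Copying is disabled so the count stays easy to reason about.
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};

// Allocates on the free store. The caller becomes responsible for
// releasing the object exactly once.
Tracked* make_tracked(int v) {
    return new Tracked(v);
}

// Releases the object and nulls the caller's pointer. Deleting a null
// pointer is a no-op, so calling destroy twice on the same variable is safe.
// Note: this only protects the one pointer variable passed in; any other
// copies of the pointer are still dangling after the call.
void destroy(Tracked*& p) {
    delete p;
    p = nullptr;
}
```

If a caller obtains an object from make_tracked and never passes it to destroy, live_count() stays above zero, which is exactly the leak described in the first list item. The nulling in destroy is a convention rather than a guarantee, which is one reason manual management is error-prone.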
Resource Acquisition Is Initialization
Manual memory management has one correctness advantage: it allows automatic resource management via the Resource Acquisition Is Initialization (RAII) paradigm. This arises when objects own scarce system resources which must be relinquished when the object is destroyed, that is, when the lifetime of the resource ownership should be tied to the lifetime of the object. Languages with manual management can arrange this by acquiring the resource during object initialization and releasing it during object destruction, which occurs at a precise time.
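As an illustration of tying a resource's lifetime to an object's lifetime, here is a minimal C++ sketch. The SlotPool class stands in for some scarce system resource and SlotGuard for the owning object; both names, and the two demonstration functions, are assumptions made for this example rather than part of any library.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical scarce resource: a pool with a fixed number of slots.
class SlotPool {
public:
    explicit SlotPool(int capacity) : free_(capacity) {
        assert(capacity >= 0);
    }

    void acquire() {
        if (free_ == 0) throw std::runtime_error("pool exhausted");
        --free_;
    }

    void release() { ++free_; }

    int available() const { return free_; }

private:
    int free_;
};

// RAII guard: the slot is acquired in the constructor and released in the
// destructor, so ownership lasts exactly as long as the guard object.
class SlotGuard {
public:
    explicit SlotGuard(SlotPool& pool) : pool_(pool) { pool_.acquire(); }
    ~SlotGuard() { pool_.release(); }

    // Copying would release the same slot twice, so it is disabled.
    SlotGuard(const SlotGuard&) = delete;
    SlotGuard& operator=(const SlotGuard&) = delete;

private:
    SlotPool& pool_;
};

// Returns the number of free slots while a guard is alive.
// The guard is destroyed, and the slot released, when the function returns.
int available_during_use(SlotPool& pool) {
    SlotGuard guard(pool);
    return pool.available();
}

// Shows that the slot is also released when the scope is left by an exception.
bool released_after_exception(SlotPool& pool) {
    const int before = pool.available();
    try {
        SlotGuard guard(pool);
        throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        // The guard's destructor has already run during stack unwinding.
    }
    return pool.available() == before;
}
```

Because the release happens in the destructor, no code path (normal return or exception) can leave the scope without giving the slot back, which is the "precise time" property the paragraph above relies on.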
RAII can also be used with deterministic reference counting. In C++, this ability is put to further use to automate memory deallocation within an otherwise manual framework; using the shared_ptr template in the language's standard library to perform memory management is a common paradigm. shared_ptr is not suitable for all object usage patterns, however.
This approach is not usable in most garbage-collected languages – notably those with tracing garbage collectors or more advanced reference counting – because finalization is non-deterministic and sometimes does not occur at all. That is, it is difficult to define when, or whether, a finalizer method might be called; this is commonly known as the finalizer problem. Java and other garbage-collected languages frequently use manual management for scarce system resources besides memory via the dispose pattern: any object which manages resources is expected to implement the dispose method, which releases any such resources and marks the object as inactive. Programmers are expected to invoke dispose manually as appropriate to prevent "leaking" of scarce graphics resources. Depending on the finalize method to release graphics resources is widely viewed as poor programming practice among Java programmers, and similarly the analogous __del__ method in Python cannot be relied on for releasing resources. For stack resources (resources acquired and released within a single block of code), this can be automated by various language constructs, such as Python's with, C#'s using, or Java's try-with-resources.
Performance
Many advocates of manual memory management argue that it affords superior performance when compared to automatic techniques such as garbage collection. Traditionally latency was the biggest advantage, but this is no longer the case. Manual allocation frequently has superior locality of reference.
Manual allocation is also known to be more appropriate for systems where memory is a scarce resource, due to faster reclamation. Memory systems can and do frequently "thrash" as the size of a program's working set approaches the size of available memory; unused objects in a garbage-collected system remain unreclaimed for longer than in manually managed systems, which increases the effective working set size.
Manual management has a number of documented performance disadvantages:
- Calls to delete and similar functions incur an overhead each time they are made; under garbage collection this overhead can be amortized over collection cycles. This is especially true of multithreaded applications, where delete calls must be synchronized.
- The allocation routine may be more complicated, and slower. Some garbage collection schemes, such as those with heap compaction, can maintain the free store as a simple array of memory.
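To see why a free store kept as a simple array makes allocation cheap, consider the following C++ sketch of bump-pointer allocation. It is not a garbage collector and is only an assumption about how such a scheme might look: the BumpArena class and its member names are hypothetical. Allocation is a bounds check plus an offset increment, and individual objects cannot be freed; the whole arena is reset at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical arena that hands out memory from one contiguous buffer.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the arena cannot satisfy the request.
    // `alignment` must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        // Round the current address up to the next multiple of `alignment`.
        const std::uintptr_t aligned =
            (current + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - current);

        const std::size_t remaining = buffer_.size() - offset_;
        if (padding > remaining || size > remaining - padding) {
            return nullptr;  // exhausted
        }

        offset_ += padding + size;  // "bump" the offset past the new block
        return reinterpret_cast<void*>(aligned);
    }

    // Discards every allocation at once; no per-object bookkeeping is needed.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

A general-purpose manual allocator, by contrast, typically has to search or maintain free lists so that individually released blocks can be reused, which is the extra complexity the list item above refers to.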
Manual allocation does not suffer from the long "pause" times that occur in simple stop-the-world garbage collection, although modern garbage collectors have collection cycles which are often not noticeable.
Manual memory management and garbage collection both suffer from potentially unbounded deallocation times: manual memory management because deallocating a single object may require deallocating its members, and recursively its members' members, and so on, while garbage collection may have long collection cycles. This is especially an issue in real-time systems, where unbounded collection cycles are generally unacceptable; real-time garbage collection is possible by pausing the garbage collector, while real-time manual memory management requires avoiding large deallocations, or manually pausing deallocation.
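One way to "manually pause" a large deallocation is to release a structure in bounded steps rather than all at once. The C++ sketch below assumes a singly linked list as the structure being torn down; Node, make_list and BoundedListDeleter are hypothetical names used only for this example, and the budget of 64 in the destructor is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int value;
    Node* next;
};

// Builds a list of `n` nodes on the free store; the caller owns the result.
Node* make_list(std::size_t n) {
    Node* head = nullptr;
    for (std::size_t i = 0; i < n; ++i) {
        head = new Node{static_cast<int>(i), head};
    }
    return head;
}

// Takes ownership of a list and frees it a few nodes at a time, so that no
// single call performs an unbounded amount of deallocation work.
class BoundedListDeleter {
public:
    explicit BoundedListDeleter(Node* head) : pending_(head) {}

    // Anything still pending is freed here so that nothing leaks.
    ~BoundedListDeleter() {
        while (!done()) step(64);
    }

    BoundedListDeleter(const BoundedListDeleter&) = delete;
    BoundedListDeleter& operator=(const BoundedListDeleter&) = delete;

    // Frees at most `budget` nodes and returns how many were actually freed.
    std::size_t step(std::size_t budget) {
        assert(budget > 0);
        std::size_t freed = 0;
        while (pending_ != nullptr && freed < budget) {
            Node* next = pending_->next;
            delete pending_;
            pending_ = next;
            ++freed;
        }
        return freed;
    }

    bool done() const { return pending_ == nullptr; }

private:
    Node* pending_;
};
```

A time-critical loop could call step with a small budget once per iteration, spreading the cost of tearing down a large structure across many iterations instead of paying it in one pause.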