Generics and the Cost of Type Erasure

Prerequisites: build-tools-modules-project-structure, memory-model-gc-vs-ownership

What You'll Learn

Why Java's generic type parameters disappear at runtime (type erasure) and what that means in practice.
How type erasure contrasts with Rust's monomorphization, and the trade-offs each strategy makes.
How to read and write bounded type parameters (<T extends Comparable<T>>) and how they differ from Rust trait bounds.
How to apply PECS (Producer Extends, Consumer Super) to choose between <? extends T> and <? super T>.
Why Java generics cannot work with primitives directly, and what boxing costs you at runtime.

Why This Matters

You have written Rust generics. You know that fn identity<T>(x: T) -> T is zero-cost — the compiler stamps out separate machine code for every T you actually use. When you switch to Java and write <T> T identity(T x), it looks the same on the surface. It is not. Java generics are a compile-time convenience that evaporates before the JVM ever runs your code.

This matters for two concrete reasons. First, you cannot ask Java at runtime "is this a List<String> or a List<Integer>?" The JVM genuinely does not know — both types compile to identical bytecode. Second, every time you put an int into a List<Integer>, Java wraps it in an Integer object on the heap, because generic types only work with reference types. In a payment processor handling millions of records, that boxing accumulates.

Understanding type erasure also unlocks the logic behind Java's wildcard syntax — which looks cryptic until you see why it exists.

Core Concept

In Rust, generics use monomorphization: the compiler reads your generic function definition, finds every concrete type you instantiate it with, and generates a separate specialized function for each one. identity::<i32> and identity::<String> become two distinct functions in the binary. The type information is preserved because each compiled path works with exactly one type.

Java takes the opposite approach. Type erasure means the javac compiler removes all generic type parameters before producing bytecode. A Container<String> and a Container<Integer> compile to the same Container class, with fields typed as Object. Where your source code says T, the bytecode says Object (or the upper bound if you declared one). The compiler inserts type casts at every point where you read out of a generic container, so you still get type safety in source code — but at runtime, the casts are the only thing enforcing types.
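A quick way to observe erasure is to compare the runtime classes of two differently parameterized lists. The sketch below is illustrative only; the class name ErasureDemo and its method are assumptions introduced for this example, not part of the payment processor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, introduced only for this sketch.
final class ErasureDemo {

    // Returns true when both lists share a single runtime class.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        // The type arguments exist only in source; both objects are plain ArrayList.
        return names.getClass() == counts.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                 // prints "true"
        System.out.println(new ArrayList<String>().getClass()); // prints "class java.util.ArrayList"
    }
}
```

Neither object records String or Integer anywhere; the element types were checked by javac and then dropped.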

Why did Java choose this? Backward compatibility. Generics were added in Java 5, by which point a large body of libraries and compiled class files had been written without them. By erasing types, the Java team ensured that pre-generic and generic code could interoperate without changing the JVM's instruction set. One compiled class works for all type arguments: that is the explicit trade-off.

Dynamic dispatch vs. static dispatch. Before going further, let us define two terms that will reappear throughout this plan:

Static dispatch is what Rust's monomorphization gives you. The compiler resolves exactly which code path to run for each type at compile time. No runtime lookup is needed. The compiler can inline, optimize, and reason about the concrete type.
Dynamic dispatch is Java's default for interface and virtual method calls. When you call validator.validate(payment), the JVM consults the object's virtual method table (vtable) at runtime to find the right implementation. Module 04 covers this in depth.

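Monomorphization is a static dispatch strategy. Type erasure is not a dispatch strategy at all: it describes what happens to type information, which is removed, with Object (or the declared bound) taking its place. Method calls on an erased T are still dispatched dynamically, according to the class the object actually has.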

Note: Type erasure and monomorphization are independent implementation decisions. Type erasure is about discarding type information at compile time. Monomorphization is about generating specialized code per concrete type. Java erases type information and uses one compiled class for all type arguments. Rust monomorphizes and preserves full type information. These choices are orthogonal. A language could theoretically monomorphize without preserving type info, or preserve type info without specializing code. Understanding this prevents the false assumption that "if Java monomorphized, it would automatically keep type information at runtime."

Bounded type parameters. Java lets you constrain a type parameter:

// Java
public <T extends Comparable<T>> T max(T a, T b) {
    return a.compareTo(b) >= 0 ? a : b;
}

This is similar to Rust's trait bound T: Ord, but the mechanics differ. Java bounds work through the class hierarchy: T extends Comparable<T> means T must be a subtype of Comparable<T>, which its class has to declare when it is written. Rust trait bounds are orthogonal to any type hierarchy: you can implement Ord for a type without it being a subclass of anything. Java's single-parent inheritance makes complex bounds less composable. You can combine bounds with &, as in <T extends Serializable & Comparable<T>>, but at most one bound may be a class, and if there is one it must come first; every other bound must be an interface.
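As a minimal sketch of an intersection bound, the method below requires both Comparable<T> and Serializable. Here both bounds are interfaces, so their order is free; a class bound, if one were present, would have to be listed first. BoundsDemo and maxOf are hypothetical names chosen for illustration.

```java
import java.io.Serializable;

// Hypothetical demo class, introduced only for this sketch.
final class BoundsDemo {

    // T must implement both interfaces; after erasure T becomes Comparable,
    // the leftmost bound.
    static <T extends Comparable<T> & Serializable> T maxOf(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(maxOf(3, 7));          // prints "7"
        System.out.println(maxOf("pear", "fig")); // prints "pear"
    }
}
```

Integer and String both qualify because each declares the two interfaces in its own class definition (Integer inherits Serializable from Number). A type that does not declare them cannot be made to satisfy the bound from outside.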

Boxing and primitives. Java generics only work with reference types. There is no List<int>; you must use List<Integer>. When you add an int to a List<Integer>, Java performs boxing: it wraps the primitive value in an Integer object on the heap (small values, at least -128 through 127, come from a shared cache rather than a fresh allocation). When you read it back and do arithmetic, Java unboxes it. Autoboxing is the automatic form of this conversion, inserted by the compiler. As Module 02 explained, heap allocations put pressure on the garbage collector. A million distinct integers stored in a List<Integer> means close to a million Integer objects. In a tight inner loop of a payment processor, this matters.
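The sketch below makes the compiler-inserted conversions visible by summing the same values twice, once through a List<Integer> and once through an int[]. It is an illustration rather than a benchmark; BoxingDemo and its method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, introduced only for this sketch.
final class BoxingDemo {

    // Sums 0..n-1 through a List<Integer>: each add() boxes, each read unboxes.
    static long sumBoxed(int n) {
        List<Integer> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(i);        // compiled as values.add(Integer.valueOf(i))
        }
        long total = 0;
        for (Integer v : values) {
            total += v;           // compiled as total += v.intValue()
        }
        return total;
    }

    // Same computation over an int[]: values are stored inline, no wrappers.
    static long sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000_000));     // prints "499999500000"
        System.out.println(sumPrimitive(1_000_000)); // prints "499999500000"
    }
}
```

Both methods return the same total. The difference is in what gets allocated along the way, which is why hot paths often fall back to primitive arrays or the primitive stream types such as IntStream.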

Raw types. A raw type is what you get when you use a generic class without its type parameter: List instead of List<String>. Raw types exist only for backward compatibility — to let pre-Java-5 code compile with generic classes. In new code, never use raw types. The compiler will warn you. Treat warnings about unchecked raw-type operations as errors.
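To see why those warnings deserve attention, here is a small sketch of what a raw type can do to a typed list. The class name RawTypeDemo is hypothetical, and the warnings are suppressed only so the example compiles quietly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, introduced only for this sketch.
final class RawTypeDemo {

    // Returns true if reading through the typed view fails with ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean pollutedReadFails() {
        List<String> names = new ArrayList<>();
        List raw = names;   // raw view of the same list: element checks are off
        raw.add(42);        // normally an unchecked warning; stores an Integer
        try {
            String first = names.get(0); // compiler-inserted (String) cast fails here
            return first == null;        // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollutedReadFails()); // prints "true"
    }
}
```

The failure surfaces at the read, not at the faulty add, which is what makes this kind of bug hard to trace back to its cause.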

Concrete Example

Here is what type erasure looks like from the compiler's perspective. You write this:

// Java — what you write
public class Container<T> {
    private T value;

    public void set(T value) { this.value = value; }
    public T get() { return value; }
}

After compilation, the bytecode is equivalent to this:

// Java — what the compiler produces (simplified)
public class Container {
    private Object value;

    public void set(Object value) { this.value = value; }
    public Object get() { return value; }
}

And at every call site where you write container.get() expecting a String, the compiler inserts an explicit cast:

// Java — compiler-inserted cast
Container<String> c = new Container<>();
c.set("hello");
String s = (String) c.get();  // cast inserted for you; throws ClassCastException if wrong type

Now here is the Rust equivalent, which takes the opposite path:

// Rust equivalent
fn identity<T>(x: T) -> T {
    x
}

// Two call sites → two specialized functions in the binary:
let s = identity("hello");   // compiles to identity::<&str>
let i = identity(42_i32);    // compiles to identity::<i32>

The Rust binary contains separate machine code for each instantiation. The Java bytecode contains one class. Neither approach is universally better: Java's approach saves binary space and compile time, while Rust's enables better optimization and avoids boxing.

Now for the payment processor. Introduce this generic record in the com.example.payments package:

// Java — com/example/payments/Pair.java
package com.example.payments;

public record Pair<A, B>(A first, B second) {}

Use it to represent a payment alongside its exchange rate:

// Java
Pair<Payment, Double> result = new Pair<>(payment, 1.08);
System.out.println(result.first().amount());  // works fine

Now try to distinguish two Pair instantiations at runtime:

// Java — this does NOT compile
Object obj = new Pair<>("hello", 42);
if (obj instanceof Pair<String, Integer>) {  // compile error: the type arguments cannot be checked at runtime
    System.out.println("string-int pair");
}

The compiler refuses because Pair<String, Integer> and Pair<Integer, String> are indistinguishable at runtime: both are just Pair. For a value whose static type is Object, you can only test obj instanceof Pair or obj instanceof Pair<?, ?>, neither of which names type arguments to check. The type parameters have been erased.
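When the distinction really is needed at runtime, one option is to test the erased type and then inspect the components themselves. The sketch below assumes a stand-in Pair written as a plain nested class (so it does not depend on record support); PairInspectDemo and describe are hypothetical names.

```java
// Hypothetical demo class, introduced only for this sketch.
final class PairInspectDemo {

    // Stand-in for the Pair record, written as a plain class for this example.
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    // Tests the erased Pair type first, then each component's own runtime class.
    static String describe(Object obj) {
        if (!(obj instanceof Pair<?, ?>)) {
            return "not a pair";
        }
        Pair<?, ?> p = (Pair<?, ?>) obj; // wildcard cast: fully checkable at runtime
        if (p.first instanceof String && p.second instanceof Integer) {
            return "string-int pair";
        }
        return "some other pair";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Pair<>("hello", 42))); // prints "string-int pair"
        System.out.println(describe(new Pair<>(42, "hello"))); // prints "some other pair"
        System.out.println(describe("hello"));                 // prints "not a pair"
    }
}
```

This checks the values a pair happens to hold, not its declared type arguments. A pair with null components, for example, reveals nothing about what it was declared to contain.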

Analogy

Think of type erasure in terms of cookie cutters.

A bakery makes one generic cookie cutter — a single metal mold. When the baker uses it for star-shaped cookies, she stamps them, boxes them with a "STAR" label, and ships them. When she uses it for heart-shaped cookies, the same mold is used, with a "HEART" label.

At delivery time (runtime), the driver sees only boxes and labels. The mold stayed behind at the bakery, so if a label is missing or wrong there is nothing to check it against. The driver must open the box and look (cast and inspect the object).

Rust's bakery has a magical mold-maker. For star cookies, it stamps out a purpose-built star mold. For heart cookies, a purpose-built heart mold. Each mold is unique, shaped exactly for its contents, and the driver always knows what is inside from the shape of the mold itself. The mold travels with the goods.

The Java bakery ships faster with less storage — one mold, infinite reuse. But the driver has to trust the labels, and if a label is wrong, cookies fall on the floor at runtime.

Going Deeper

Reifiable vs. non-reifiable types. A reifiable type is one whose full type information is available at runtime. Primitives (int, double), non-generic classes (String, Payment), and raw types (List) are reifiable. List<String> and List<Integer> are non-reifiable — the JVM cannot distinguish them. This is why you cannot create an array of a generic type: new T[10] is a compile error. Arrays preserve their component type at runtime (arrays are reifiable), but T is not known at runtime. The workaround is to pass a Class<T> token and use Array.newInstance, or simply use a List<T> instead of an array.
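The Class<T> workaround can be sketched as follows. Array.newInstance is the standard reflection call in java.lang.reflect; the wrapper class GenericArrayDemo and the helper newArray are hypothetical names for this example.

```java
import java.lang.reflect.Array;

// Hypothetical demo class, introduced only for this sketch.
final class GenericArrayDemo {

    // Creates a T[] of the given length from a runtime Class<T> token.
    // Intended for reference types: a primitive token such as int.class
    // would produce an int[], which cannot be cast to T[].
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> componentType, int length) {
        // Array.newInstance returns Object; the cast is unchecked but sound here
        // because the array's real component type is exactly componentType.
        return (T[]) Array.newInstance(componentType, length);
    }

    public static void main(String[] args) {
        String[] accounts = newArray(String.class, 3);
        accounts[0] = "ACC-001";
        System.out.println(accounts.getClass().getSimpleName()); // prints "String[]"
        System.out.println(accounts.length);                     // prints "3"
    }
}
```

The token carries at runtime exactly the information that erasure removed from T, which is why the caller has to supply it explicitly.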

PECS: Producer Extends, Consumer Super. Java's wildcard system is use-site variance: you annotate each place a generic type is used, typically a method parameter, with how you intend to use the collection, rather than declaring variance once in the type definition (as Kotlin and Scala allow). The mnemonic is PECS: Producer Extends, Consumer Super.

If a method reads from a collection (the collection is a producer of values), use <? extends T>:

// Java — reading from a collection of Payment subtypes
public static double totalAmount(List<? extends Payment> payments) {
    double total = 0;
    for (Payment p : payments) {  // reading: safe
        total += p.amount();
    }
    // payments.add(new Payment(...));  // COMPILE ERROR: can't write
    return total;
}

If a method writes into a collection (the collection is a consumer of values), use <? super T>:

// Java — writing Payment objects into a compatible list
public static void fillWithDefault(List<? super Payment> sink) {
    sink.add(new Payment(0L, "ACC-001", "ACC-002", 0.0, "USD"));  // writing: safe
    // Payment p = sink.get(0);  // COMPILE ERROR: get() returns Object
}

In Rust, most of this complexity does not arise, because Rust has no subtype relationships between ordinary types such as structs; variance only matters for lifetimes. A function accepting impl IntoIterator<Item = Payment> just works without wildcard-style annotations. PECS is a mnemonic for Java's use-site variance, which is needed because Java combines generics with subclassing.
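Both wildcards often appear in a single signature: one parameter produces values and the other consumes them. The sketch below follows the same shape as java.util.Collections.copy; PecsDemo and copyAll are hypothetical names, and Integer and Number stand in for a Payment hierarchy to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, introduced only for this sketch.
final class PecsDemo {

    // src produces T values (extends); dst consumes T values (super).
    static <T> void copyAll(List<? extends T> src, List<? super T> dst) {
        for (T item : src) {
            dst.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        copyAll(ints, numbers);      // a List<Integer> source feeding a List<Number> sink
        System.out.println(numbers); // prints "[1, 2, 3]"
    }
}
```

Without the wildcards, copyAll(List<T>, List<T>) would reject this call, because List<Integer> is not a subtype of List<Number>.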

Heap layout implications. Because Java generics erase to Object, generic containers store object references — pointers to heap-allocated objects. A List<Integer> holding a million integers stores a million Integer heap objects plus a million pointers in the list's internal array. In Rust, Vec<i32> stores the integers directly in a contiguous block of memory. The difference in cache locality alone can dominate performance in tight loops. This connects directly to Module 02's discussion of GC pressure: those million Integer objects are tracked by the garbage collector and must be collected eventually.

Bridge methods. When a class extends or implements a generic type and overrides a method whose signature mentions the type parameter, the erased signature in the parent no longer matches the override. The compiler generates bridge methods to preserve polymorphism after erasure: synthetic methods with the erased signature that cast their arguments and delegate to your override. This is an internal implementation detail you rarely need to worry about, but it explains why you occasionally see synthetic methods in stack traces or reflection output.
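Bridge methods can be observed through reflection with Method.isBridge(). The following sketch assumes a hypothetical Amount class that implements Comparable<Amount>; the names BridgeDemo, Amount, and bridgeCount are illustrative only.

```java
import java.lang.reflect.Method;

// Hypothetical demo class, introduced only for this sketch.
final class BridgeDemo {

    // After erasure, Comparable declares compareTo(Object), so javac adds a
    // synthetic compareTo(Object) to Amount that delegates to compareTo(Amount).
    static final class Amount implements Comparable<Amount> {
        final long cents;

        Amount(long cents) {
            this.cents = cents;
        }

        @Override
        public int compareTo(Amount other) {
            return Long.compare(cents, other.cents);
        }
    }

    // Counts the compiler-generated bridge methods declared on Amount.
    static int bridgeCount() {
        int count = 0;
        for (Method m : Amount.class.getDeclaredMethods()) {
            if (m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Lists compareTo(Amount) and the bridge compareTo(Object); order may vary.
        for (Method m : Amount.class.getDeclaredMethods()) {
            System.out.println(m + " bridge=" + m.isBridge());
        }
    }
}
```

Code that calls compareTo through a Comparable reference lands on the bridge, which casts its argument to Amount and forwards to the method you wrote.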

Common Misconceptions

1. "Java generics are like C++ templates."

Not quite. C++ templates are instantiated separately for each set of template arguments, much like Rust's monomorphization. Java's type erasure means an object carries no record of its type arguments at runtime. The compile-time safety is similar, but the runtime behavior is completely different. In C++, std::vector<int> and std::vector<std::string> are distinct types with distinct generated code. In Java, List<Integer> and List<String> are the same class at runtime.

2. "Type erasure makes Java generics unsafe."

Not in well-written code. The javac compiler enforces all type safety before erasure happens. The compiled bytecode contains the necessary casts, and those casts are verified at runtime. The danger arises only when you mix raw types with generic types, or use reflection to bypass type checking. If you see an "unchecked cast" compiler warning, treat it seriously — that warning is telling you that type safety is no longer guaranteed for that operation.

3. "Wildcards in Java do what Rust trait bounds do."

Partially true, but they solve it differently. Rust trait bounds are definition-site: you write fn foo<T: Ord>(x: T) and the constraint is part of the function definition. Java wildcards are use-site: you write List<? extends Number> wherever the generic type is used, usually a method parameter, to express how you plan to use the collection. Both constrain which types are acceptable, but a wildcard constrains a single use of a generic type, while a trait bound constrains every use of a generic function. This difference means Java method signatures often carry more variance annotations than equivalent Rust APIs.

Check Your Understanding

After javac compiles class Box<T> { T value; }, what type does the value field have in the bytecode?

Answer: Object. Type erasure replaces the unbounded type parameter T with Object. If T had a bound (for example, T extends Comparable<T>), the field would be typed as Comparable (the first bound).

Why can't you write new T[10] inside a generic method in Java?

Answer: Array creation requires a reifiable type because arrays store their component type at runtime and perform a runtime type check on every write. T is not reifiable: it is erased and not available at runtime. The compiler rejects new T[10] because it cannot generate a valid array creation instruction. The workaround is to accept a Class<T> token and use Array.newInstance(clazz, 10).

You have a List<Integer> containing the values 1 through 1,000,000. How does this compare to Vec<i32> in Rust from a memory layout perspective, and why does it matter for GC?

Answer: List<Integer> stores one million Integer objects on the heap, plus an internal array of one million pointers (references) into those objects. Vec<i32> stores the integers directly and contiguously in one heap-allocated block. The Java version generates one million extra heap objects that the garbage collector must track and eventually collect; this is the boxing overhead described in Module 02.

You want to write a method that accepts a list and appends default Payment objects to it. Should you use List<? extends Payment> or List<? super Payment>? Why?

Answer: Use List<? super Payment>. Your method is a consumer: it writes into the list. PECS says: Consumer Super. List<? super Payment> allows the list to be a List<Payment>, List<Object>, or any list that can hold Payment values. List<? extends Payment> would prevent you from writing at all, because the compiler cannot guarantee the list's exact type is compatible for insertion.

At runtime, can you distinguish a Pair<String, Integer> from a Pair<Integer, String> using instanceof?

Answer: No. Type erasure removes the type arguments at compile time. Both are the same Pair class at runtime. instanceof Pair<String, Integer> is a compile error. The best you can do is instanceof Pair, which matches any Pair regardless of its type arguments.

Key Takeaways

Type erasure removes all generic type parameters at compile time, replacing them with Object or the declared upper bound. One compiled class serves all type arguments. Rust makes the contrasting choice: monomorphization generates specialized code per concrete type.
Static dispatch (Rust's default via monomorphization) resolves code paths at compile time. Dynamic dispatch (Java's default for virtual calls) resolves them at runtime via vtable lookup. These terms are defined here and used consistently throughout this plan.
Boxing is the unavoidable cost of Java generics with primitives. List<Integer> wraps every int in a heap-allocated Integer object. In performance-sensitive paths, this drives GC pressure.
PECS (Producer Extends, Consumer Super) is the practical rule for wildcards: <? extends T> when you only read from a collection; <? super T> when you only write into it.
Raw types exist only for backward compatibility. Never use them in new code: they disable compile-time type checking for that usage, leaving only an unchecked warning between you and a runtime ClassCastException.

References

Type Erasure — The Java Tutorials (Oracle) — The official explanation of how javac replaces type parameters and inserts casts; essential reading for understanding what the compiler actually does.
Type Erasure — Dev.java — The modern Java learning platform's treatment of type erasure, including reifiable types and bridge methods.
Guidelines for Wildcard Use — The Java Tutorials (Oracle) — Oracle's official guidance on when to use ? extends vs ? super, directly informing the PECS discussion.
Rust generics vs Java generics — fasterthanli.me — A detailed comparison from a Rust perspective covering monomorphization, boxing, and the trait-bound vs class-hierarchy distinction.
Monomorphization — Rust Compiler Development Guide — The authoritative description of how rustc performs monomorphization, useful for understanding exactly what Java is not doing.