UUID

Information from The State of Sarkhan Official Records

Understanding UUIDs: The Good, The Bad, and The Ugly

What Is a UUID?

A UUID (Universally Unique Identifier) is a 128-bit number used to uniquely identify objects across different systems. Unlike traditional numeric IDs (such as AUTO_INCREMENT in MySQL), UUIDs are designed to be globally unique without requiring a central authority to issue them. This makes them widely used in distributed systems, databases, and applications where uniqueness is crucial.

UUIDs are typically represented as a 36-character string in the format:

550e8400-e29b-41d4-a716-446655440000

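The string is 32 hexadecimal digits in five hyphen-separated groups (8-4-4-4-12). A few bits encode the UUID's version and variant; the meaning of the rest depends on the version. The 128 bits allow 2¹²⁸ (around 3.4 × 10³⁸) distinct values. Because some bits are reserved, a random version 4 UUID carries 122 random bits (around 5.3 × 10³⁶ values), which is still far more than any realistic system will consume.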
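As a rough illustration of that layout, the sketch below parses the example value above with Python's standard uuid module. The variable names are chosen only for this example.

```python
import uuid

# Parse the example UUID shown in the text above.
example = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

text_length = len(str(example))   # characters in the hyphenated string form
byte_length = len(example.bytes)  # bytes in the raw 128-bit form
version = example.version         # encoded in the first digit of the third group
variant = example.variant         # encoded in the top bits of the fourth group

total_values = 2 ** 128           # size of the full 128-bit space
random_v4_values = 2 ** 122       # version 4 reserves 6 bits for version and variant

print(text_length, byte_length, version, variant)
print(f"{total_values:.1e}", f"{random_v4_values:.1e}")
```

The version digit is the "4" in "41d4", which marks this value as a random (version 4) UUID.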


Why Use UUIDs?

UUIDs are used in various applications because they solve several common problems related to data uniqueness and distribution. Here’s why they are popular:

1. Globally Unique Without a Central Authority

In distributed systems, where multiple databases or services need to generate unique IDs independently, UUIDs make collisions so improbable that the IDs can be treated as unique in practice. No central issuing authority (which could be a bottleneck) is required.
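To put a number on how unlikely a collision is, here is a minimal sketch using the standard birthday-bound approximation. It assumes version 4 UUIDs with 122 random bits drawn from a good random source; the function name and the sample sizes are illustrative.

```python
import math
import uuid

def collision_probability(count: int, random_bits: int = 122) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / (2 * 2^bits))."""
    space = 2 ** random_bits
    return -math.expm1(-count * (count - 1) / (2 * space))

# Two "nodes" generate IDs independently, with no shared counter or coordinator.
node_a_ids = {uuid.uuid4() for _ in range(10_000)}
node_b_ids = {uuid.uuid4() for _ in range(10_000)}
overlap = node_a_ids & node_b_ids

# Estimated chance of any duplicate among one billion random UUIDs.
one_billion = collision_probability(10 ** 9)
print(len(overlap), one_billion)
```

The estimate only holds if the generator's randomness is sound; a weak or badly seeded random source makes collisions far more likely.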

2. Persistent Identification (Immutable Identifiers)

Unlike usernames, a UUID assigned to an entity stays the same for its whole lifetime, even when the data moves between systems. A common example is Minecraft UUIDs: when a player changes their username, their UUID remains the same, ensuring that in-game ownership and bans remain linked to the same identity.

3. Prevents Data Collisions in Large Systems

For systems that merge data from different sources (e.g., multi-region databases, microservices), UUIDs make duplicate IDs practically impossible. Traditional AUTO_INCREMENT IDs conflict easily when syncing across databases, because every database starts counting from the same value.
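The merge problem can be sketched with plain dictionaries standing in for two regional tables. The merge helper and the sample rows are hypothetical and exist only for this illustration.

```python
import uuid

def merge(*sources):
    """Merge dicts keyed by ID and report keys that appear in more than one source."""
    merged, conflicts = {}, []
    for source in sources:
        for key, row in source.items():
            if key in merged:
                conflicts.append(key)
            merged[key] = row  # a later source silently overwrites an earlier one
    return merged, conflicts

# Each region numbers its rows independently, starting from 1.
region_a = {1: "alice", 2: "bob"}
region_b = {1: "carol", 2: "dave"}
merged_int, conflicts_int = merge(region_a, region_b)

# The same rows keyed by independently generated version 4 UUIDs.
region_a_uuid = {uuid.uuid4(): name for name in region_a.values()}
region_b_uuid = {uuid.uuid4(): name for name in region_b.values()}
merged_uuid, conflicts_uuid = merge(region_a_uuid, region_b_uuid)

print(conflicts_int, len(merged_int))
print(conflicts_uuid, len(merged_uuid))
```

With integer keys both IDs clash and two rows are overwritten; with UUID keys all four rows survive the merge.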

4. Enhances Security & Privacy

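Numeric IDs (1, 2, 3, 4...) expose the order and approximate number of records, making it easier for attackers to infer information about a system. Random (version 4) UUIDs are non-sequential, so they obscure this metadata and make enumeration attacks harder. They are not a substitute for access control, and timestamp-based versions still reveal when an ID was created.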


Why Are UUIDs Inefficient as Primary Keys?

Despite their benefits, UUIDs have some major downsides—especially when used as primary keys in relational databases.

1. Larger Storage Requirements

UUIDs are 128-bit values, whereas traditional integer-based primary keys (like BIGINT) are only 64-bit. This means:

  • Larger index sizes in databases.
  • Increased storage overhead (a UUID takes 16 bytes in binary form, or 36 bytes when stored as text, vs. 8 bytes for a BIGINT).
  • More memory usage when handling large datasets.
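The size difference is easy to check. The sketch below measures each representation with Python's standard library; the row count is a made-up figure used only to scale the result.

```python
import struct
import uuid

value = uuid.uuid4()

binary_size = len(value.bytes)       # raw 128-bit form
text_size = len(str(value))          # hyphenated hex string
hex_size = len(value.hex)            # hex string without hyphens
bigint_size = struct.calcsize("<q")  # signed 64-bit integer

rows = 100_000_000  # hypothetical table size
extra_bytes = rows * (binary_size - bigint_size)
print(binary_size, text_size, hex_size, bigint_size, extra_bytes)
```

The extra bytes apply to the key column alone; every index that stores a copy of the key pays the same cost again.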

2. Poor Indexing Performance

In engines such as MySQL's InnoDB, and by default in SQL Server, the primary key is the clustered index, so rows are physically stored in key order. Since version 4 UUIDs are random, they don't maintain any natural order, leading to:

  • Fragmentation in database indexes, because inserts into the middle of the index force page splits.
  • Slower insert performance because new records are scattered across different parts of the storage instead of being added sequentially.
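A toy model makes the difference in insert pattern visible. The sketch below keeps keys in a sorted Python list (a stand-in for an ordered index, not a real B-tree) and counts how many inserts land at the very end; the function name and the value of n are illustrative.

```python
import bisect
import uuid

def count_appends(keys):
    """Insert keys into a sorted list and count how many landed at the very end."""
    ordered, appends = [], 0
    for key in keys:
        position = bisect.bisect_left(ordered, key)
        if position == len(ordered):
            appends += 1
        ordered.insert(position, key)
    return appends

n = 2_000
sequential_appends = count_appends(range(1, n + 1))
random_appends = count_appends(uuid.uuid4().bytes for _ in range(n))

print(sequential_appends, random_appends)
```

Every sequential key is appended at the end, while almost every random key has to be placed somewhere in the middle, which is the access pattern that causes page splits in a real index.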

3. Difficult for Humans to Work With

UUIDs are long, complex, and not human-friendly. Unlike numeric IDs (e.g., User #123), referencing an entity by its UUID (550e8400-e29b-41d4-a716-446655440000) is impractical for debugging, logging, and manual interactions.


Best Practices: Combining UUIDs with Auto-Increment IDs

The best way to use UUIDs in a database is to not use them as the primary key. Instead, use a traditional auto-increment integer as the primary key and store the UUID in a separate column.

Recommended Database Schema

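CREATE TABLE users (
    -- Compact, sequential key used for joins and the clustered index.
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    -- Text form of the UUID (36 characters); BINARY(16) is a more compact alternative.
    uuid CHAR(36) UNIQUE NOT NULL,
    username VARCHAR(255) NOT NULL
);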

Why This Works Better

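  • The auto-increment id is the primary key, so new rows are appended in order and the index stays compact.
  • UUIDs are stored in their own uniquely indexed column and can still be used for external references.
  • Joins and lookups on id use the small integer key, while uuid remains available as a stable public identifier.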
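A minimal sketch of this pattern in application code, using Python's built-in sqlite3 module as a stand-in for MySQL (the SQLite column types differ from the schema above). The helper names create_user and find_user are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key for joins
        uuid TEXT UNIQUE NOT NULL,             -- public identifier
        username TEXT NOT NULL
    )
    """
)

def create_user(username: str) -> str:
    """Insert a user and return the UUID to hand out to external callers."""
    public_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO users (uuid, username) VALUES (?, ?)", (public_id, username)
    )
    return public_id

def find_user(public_id: str):
    """Resolve a public UUID to the internal row (id, username), or None."""
    return conn.execute(
        "SELECT id, username FROM users WHERE uuid = ?", (public_id,)
    ).fetchone()

alice_uuid = create_user("alice")
bob_uuid = create_user("bob")
print(find_user(alice_uuid), find_user(bob_uuid))
```

External callers only ever see the UUID; the integer id stays internal and is used for joins between tables.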


Alternatives to Standard UUIDs

If you must use UUIDs as primary keys, the version you choose matters. The common options compare as follows:

🔹 UUIDv4 (Random-Based) – Default but Inefficient

  • Fully random.
  • Leads to poor indexing in databases.
  • Used in many applications, but not the best for performance.

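🔹 UUIDv1 (Timestamp-Based) – Ordered Only After Rearranging

  • Includes a 60-bit timestamp, but the least significant 32 bits of it come first, so stored values do not sort in creation order.
  • Can index better than UUIDv4 once the time fields are rearranged (for example, MySQL 8.0's UUID_TO_BIN(uuid, 1) swaps them); UUIDv6 is a reordered variant built on the same idea.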
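One caveat worth checking for UUIDv1: the string form begins with the low 32 bits of the timestamp, so plain string or byte ordering does not follow creation time. The sketch below builds two version 1 values by hand to show this. The helper v1_from_timestamp and the chosen timestamps are illustrative, and the clock sequence and node are set to zero for simplicity.

```python
import uuid

def v1_from_timestamp(timestamp: int) -> uuid.UUID:
    """Build a version 1 UUID from a 60-bit timestamp (clock sequence and node are 0)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # 0x1 marks version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, 0))

# Two timestamps one tick apart, chosen so that the low 32 bits wrap around.
earlier = v1_from_timestamp(0x1EF0000FFFFFFFF)
later = v1_from_timestamp(0x1EF0000FFFFFFFF + 1)

print(earlier, later)
print(earlier.time < later.time)    # creation order
print(str(earlier) < str(later))    # order seen by a text index

# Sorting on the embedded timestamp restores creation order.
by_time = sorted([later, earlier], key=lambda u: u.time)
```

This is why a version 1 UUID usually needs its time fields rearranged before it behaves well as a clustered key.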

🔹 UUIDv7 (Time-Ordered, Standardized in RFC 9562)

  • Puts a 48-bit Unix timestamp in milliseconds in the most significant bits, followed by random bits, so values sort by creation time.
  • Optimized for databases while maintaining uniqueness.
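To show why UUIDv7 values sort by creation time, here is a hand-rolled sketch of the bit layout: a 48-bit millisecond timestamp, the 4-bit version, 12 random bits, the 2-bit variant, and 62 more random bits. The function uuid7_sketch is an illustration only; it omits the optional counter that keeps values monotonic within one millisecond, and a maintained library generator is the better choice where one is available.

```python
import os
import time
import uuid

def uuid7_sketch(unix_ms=None) -> uuid.UUID:
    """Assemble a UUIDv7-style value: timestamp first, then version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)

first = uuid7_sketch(1_700_000_000_000)
second = uuid7_sketch(1_700_000_000_001)
print(first, second, first < second)
```

Because the timestamp occupies the leading bits, both the binary and the string forms sort in creation order across different milliseconds.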

🔹 ULID (Universally Unique Lexicographically Sortable Identifier)

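  • A 128-bit identifier like a UUID, but written as 26 Crockford Base32 characters and designed for sorting.
  • Uses timestamp-first encoding (a 48-bit millisecond timestamp followed by 80 random bits), reducing index fragmentation.
  • Much better for databases than random UUIDs.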
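The ULID encoding can be sketched in a few lines: a 48-bit millisecond timestamp and 80 random bits, written as 26 Crockford Base32 characters. The function ulid_sketch is illustrative and, unlike full implementations, does not enforce ordering between values created in the same millisecond.

```python
import os
import time

# Crockford Base32 alphabet: digits and uppercase letters without I, L, O, U.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def ulid_sketch(unix_ms=None) -> str:
    """Encode a 48-bit timestamp plus 80 random bits as a 26-character string."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128-bit value
        chars.append(ALPHABET[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))

older = ulid_sketch(1_700_000_000_000)
newer = ulid_sketch(1_700_000_000_001)
print(older, newer, older < newer)
```

The first 10 characters carry the timestamp, and the alphabet is in ascending ASCII order, so plain string comparison sorts ULIDs by creation time.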

Final Thoughts

UUIDs are invaluable for ensuring uniqueness, but using them as primary keys in a relational database introduces significant performance issues. The best approach is to:

✔️ Use AUTO_INCREMENT integers as the primary key.

✔️ Store UUIDs separately for uniqueness and external references.

✔️ Consider UUIDv7 or ULIDs (or UUIDv1 with its time fields rearranged) when a UUID must serve as the primary key in a performance-sensitive application.

By following these best practices, you get the best of both worlds—efficient database performance and globally unique identifiers where needed.

See Also