Understanding UUIDs: The Good, The Bad, and The Ugly
What Is a UUID?
A UUID (Universally Unique Identifier) is a 128-bit number used to uniquely identify objects across different systems. Unlike traditional numeric IDs (such as AUTO_INCREMENT in MySQL), UUIDs are designed to be globally unique without requiring a central authority to issue them. This makes them widely used in distributed systems, databases, and applications where uniqueness is crucial.
UUIDs are typically represented as a 36-character string in the format:
550e8400-e29b-41d4-a716-446655440000
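The 32 hexadecimal digits are grouped 8-4-4-4-12. The first digit of the third group encodes the version (4 in this example), and the leading bits of the fourth group encode the variant. With 128 bits there are 2¹²⁸ (around 3.4 × 10³⁸) possible values. Even a version-4 UUID, which fixes 6 of those bits, leaves about 5.3 × 10³⁶ random combinations, far more than any realistic system will ever generate.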
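As a small illustration of the format, the sketch below parses the example UUID shown above with Python's standard `uuid` module and reads back its length, version, and variant. It only inspects that one value; the variable names are illustrative.

```python
import uuid

# Parse the example UUID from the text above.
u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

print(len(str(u)))                 # 36 characters: 32 hex digits plus 4 hyphens
print(len(u.bytes))                # 16 bytes, i.e. 128 bits
print(u.version)                   # 4, read from the first digit of the third group
print(u.variant == uuid.RFC_4122)  # True, read from the top bits of the fourth group

# Size of the full 128-bit space.
total_values = 2 ** 128
print(f"{total_values:.1e}")       # 3.4e+38
```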
Why Use UUIDs?
UUIDs are used in various applications because they solve several common problems related to data uniqueness and distribution. Here’s why they are popular:
✅ 1. Globally Unique Without a Central Authority
In distributed systems, where multiple databases or services need to generate unique IDs independently, UUIDs make collisions so improbable that they can be treated as unique in practice, without requiring a central issuing authority (which could be a bottleneck).
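To put a rough number on how unlikely a collision is, the sketch below uses the standard birthday-bound approximation, assuming version-4 UUIDs with 122 random bits drawn from a good random source. The function name and the two "nodes" are hypothetical and exist only for this illustration.

```python
import math
import uuid

RANDOM_BITS = 122  # a version-4 UUID has 122 random bits; 6 are fixed for version and variant

def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / (2 * 2**bits))."""
    return -math.expm1(-n * (n - 1) / (2 * 2 ** bits))

# Two hypothetical nodes generate IDs without coordinating with each other.
node_a = {uuid.uuid4() for _ in range(10_000)}
node_b = {uuid.uuid4() for _ in range(10_000)}
print(len(node_a & node_b))  # number of IDs the two nodes have in common

# Chance of at least one collision among one billion version-4 UUIDs.
print(collision_probability(10 ** 9))
```

Under these assumptions the probability for a billion IDs comes out on the order of 10⁻¹⁹, which is why independent generation is usually considered safe.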
✅ 2. Persistent Identification (Immutable Identifiers)
Unlike usernames, a UUID assigned to an entity stays fixed for its lifetime. A common example is Minecraft UUIDs: when a player changes their username, their UUID remains the same, ensuring that in-game ownership and bans remain linked to the same identity.
✅ 3. Prevents Data Collisions in Large Systems
For systems that merge data from different sources (e.g., multi-region databases, microservices), UUIDs make duplicate IDs vanishingly unlikely. Traditional AUTO_INCREMENT IDs can easily cause conflicts when syncing across databases, because each database starts counting from the same values.
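The merge problem can be shown with two plain dictionaries standing in for regional databases. This is a toy sketch: the region names and rows are made up, and a real sync tool would report a key conflict rather than silently overwrite.

```python
import uuid

# Two hypothetical regional databases, each with its own auto-increment counter.
region_eu = {1: "alice", 2: "bob"}
region_us = {1: "carol", 2: "dave"}

merged = {**region_eu, **region_us}
print(len(merged))  # 2: the US rows replaced the EU rows that had the same IDs

# The same rows keyed by independently generated version-4 UUIDs.
region_eu_uuid = {uuid.uuid4(): name for name in region_eu.values()}
region_us_uuid = {uuid.uuid4(): name for name in region_us.values()}

merged_uuid = {**region_eu_uuid, **region_us_uuid}
print(len(merged_uuid))  # 4: every row survives the merge
```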
✅ 4. Enhances Security & Privacy
Numeric IDs (1, 2, 3, 4...) expose the order and approximate number of records, making it easier for attackers to infer information about a system. Random (version 4) UUIDs are non-sequential, so they obscure this metadata and make enumeration attacks impractical. They complement access control rather than replace it.
Why Are UUIDs Inefficient as Primary Keys?
Despite their benefits, UUIDs have some major downsides—especially when used as primary keys in relational databases.
❌ 1. Larger Storage Requirements
UUIDs are 128-bit values, whereas traditional integer-based primary keys (like BIGINT) are only 64-bit. This means:
- Larger index sizes in databases.
- Increased storage overhead (UUIDs take 16 bytes vs. 8 bytes for a BIGINT).
- More memory usage when handling large datasets.
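The size difference is easy to check directly. The sketch below compares an 8-byte integer, the 16-byte binary form of a UUID, and its 36-character text form, then scales each to a hypothetical table of 100 million rows. The row count is an arbitrary assumption, and the estimate ignores per-row and per-page overhead.

```python
import struct
import uuid

u = uuid.uuid4()

as_bigint = struct.pack(">q", 123)  # a signed 64-bit integer, comparable to BIGINT
as_binary = u.bytes                 # the raw 128-bit UUID
as_text = str(u).encode("ascii")    # the canonical hyphenated text form

print(len(as_bigint))  # 8
print(len(as_binary))  # 16
print(len(as_text))    # 36

# Rough key-storage estimate for a hypothetical table (overhead ignored).
rows = 100_000_000
for label, size in [("BIGINT", 8), ("binary UUID", 16), ("text UUID", 36)]:
    print(f"{label}: {rows * size / 1e9:.1f} GB")
```

Every secondary index that stores a copy of the primary key repeats this cost, so the gap grows with the number of indexes.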
❌ 2. Poor Indexing Performance
Primary keys are often used as clustered indexes (InnoDB in MySQL, for example, stores table rows in primary-key order). Since version-4 UUIDs are randomly generated, they don't maintain any natural order, leading to:
- Fragmentation in database indexes, as inserts into the middle of the index force page splits.
- Slower insert performance because new records are scattered across different parts of the storage instead of being added sequentially.
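The insert pattern can be approximated with a sorted list standing in for an ordered index. This is a simplified model (a real B-tree works on pages, not single entries), and the helper name is hypothetical, but it shows where new keys land for sequential versus random values.

```python
import bisect
import uuid

def fraction_appended(keys):
    """Insert keys into a sorted list and return the share that landed at the end."""
    index = []
    appended = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos == len(index):
            appended += 1  # the key is larger than everything already stored
        index.insert(pos, key)
    return appended / len(keys)

sequential_keys = list(range(1, 5001))
random_keys = [uuid.uuid4().bytes for _ in range(5000)]

print(fraction_appended(sequential_keys))  # 1.0: every insert goes to the tail
print(fraction_appended(random_keys))      # close to 0: almost every insert lands mid-index
```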
❌ 3. Difficult for Humans to Work With
UUIDs are long, complex, and not human-friendly. Unlike numeric IDs (e.g., User #123), referencing an entity by its UUID (550e8400-e29b-41d4-a716-446655440000) is impractical for debugging, logging, and manual interactions.
Best Practices: Combining UUIDs with Auto-Increment IDs
The best way to use UUIDs in a database is to not use them as the primary key. Instead, use a traditional auto-increment integer as the primary key and store the UUID in a separate column.
Recommended Database Schema
CREATE TABLE users (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,  -- internal key, used for joins and clustering
    uuid CHAR(36) UNIQUE NOT NULL,         -- public identifier in hyphenated text form
    username VARCHAR(255) NOT NULL
);
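A minimal sketch of how an application might use this two-column pattern follows. It runs against an in-memory SQLite database so that it is self-contained, which means `INTEGER PRIMARY KEY AUTOINCREMENT` stands in for MySQL's `BIGINT AUTO_INCREMENT`. The `create_user` helper is hypothetical and not part of the schema above.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# SQLite dialect of the schema above (assumption: MySQL is not available here).
conn.execute(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        uuid CHAR(36) UNIQUE NOT NULL,
        username VARCHAR(255) NOT NULL
    )
    """
)

def create_user(username: str) -> str:
    """Insert a user and return the UUID to hand out to external callers."""
    public_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO users (uuid, username) VALUES (?, ?)",
        (public_id, username),
    )
    return public_id

alice_uuid = create_user("alice")
bob_uuid = create_user("bob")

# External lookups go through the UUID; the integer id stays internal.
row = conn.execute(
    "SELECT id, username FROM users WHERE uuid = ?", (bob_uuid,)
).fetchone()
print(row)  # (2, 'bob')
```

The UNIQUE constraint on `uuid` creates a secondary index, so lookups by UUID stay fast while the table itself remains ordered by the compact integer key.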
Why This Works Better
✅ Auto-increment id is the primary key, making indexing efficient.
✅ UUIDs are stored separately and can still be used for external references.
✅ Queries on id are faster, while uuid remains available for uniqueness.
Alternatives to Standard UUIDs
If you must use UUIDs as primary keys, consider performance-optimized versions:
🔹 UUIDv4 (Random-Based) – Default but Inefficient
- Fully random.
- Leads to poor indexing in databases.
- Used in many applications, but not the best for performance.
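🔹 UUIDv1 (Timestamp-Based) – Timestamped, but Not Sortable As-Is
- Includes a 60-bit timestamp plus a node identifier (often derived from a MAC address).
- The low bits of the timestamp come first in the layout, so raw UUIDv1 values do not sort by creation time. They only index better than UUIDv4 once the time fields are rearranged, for example with MySQL 8.0's UUID_TO_BIN(uuid, 1).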
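🔹 UUIDv7 (Standardized in RFC 9562)
- Puts a 48-bit Unix timestamp in milliseconds in the most significant bits, so values sort by creation time.
- Fills the remaining bits with random data, keeping inserts mostly sequential for databases while maintaining uniqueness.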
🔹 ULID (Universally Unique Lexicographically Sortable Identifier)
- Similar to UUID but designed for sorting.
- Uses timestamp-first encoding, reducing index fragmentation.
- Much better for databases than standard UUIDs.
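To make the timestamp-first idea concrete, here is a hand-rolled sketch of the version-7 layout: a 48-bit millisecond timestamp in the top bits, followed by random bits, with the version and variant fields set. The function name `uuid7_sketch` is hypothetical. It is an illustration of the bit layout and not a production generator, since it adds no counter to order values created within the same millisecond.

```python
import os
import time
import uuid

def uuid7_sketch(unix_ms=None):
    """Build a UUID with the version-7 layout: 48-bit ms timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")       # 80 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80 | rand   # timestamp in the top 48 bits
    value &= ~(0xF << 76)
    value |= 0x7 << 76                                 # version field = 7
    value &= ~(0x3 << 62)
    value |= 0x2 << 62                                 # variant bits = 10
    return uuid.UUID(int=value)

# IDs created at increasing (here: fixed, illustrative) timestamps.
ids = [uuid7_sketch(unix_ms=1_700_000_000_000 + i) for i in range(5)]

print(ids == sorted(ids))                # True: later timestamps sort later
print(all(u.version == 7 for u in ids))  # True
```

Because the timestamp occupies the leading bits, the same ordering holds for the binary form and for the hyphenated text form, which is what keeps new index entries near the tail.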
Final Thoughts
UUIDs are invaluable for ensuring uniqueness, but using them as primary keys in a relational database introduces significant performance issues. The best approach is to:
✔️ Use AUTO_INCREMENT integers as the primary key.
✔️ Store UUIDs separately for uniqueness and external references.
✔️ Consider UUIDv7 or ULIDs (or UUIDv1 with its time fields rearranged) for performance-sensitive applications where the identifier itself must be the key.
By following these best practices, you get the best of both worlds—efficient database performance and globally unique identifiers where needed.