JPA Entity Id Handling


Published: 2019-12-29
Updated: 2020-01-07
Web: https://fritzthecat-blog.blogspot.com/2019/12/jpa-entity-id-handling.html


Recently I presented a JPA BaseEntity class that implements the hashCode/equals contract once-and-only-once for all sub-classes. The JPA specification does not require overriding hashCode/equals in entity classes, but for consistent handling of entities in a big project it may be advisable.

In the context of distributed systems, entities held in hash-containers, and ids occurring in URLs, various questions arise about the nature and behaviour of an entity's id. The JPA specification, chapter 2.4, states for primary keys:

The value of its primary key uniquely identifies an entity instance within a persistence context and to EntityManager operations as described in Chapter 3, "Entity Operations". The application must not change the value of the primary key. The behavior is undefined if this occurs.

This blog post is a review of the BaseEntity implementation. The big background question is: "Which scope does the id have to cover?" Unique per table? Per database? Globally unique?

Nature

Many web pages discuss the nature of ids. Search the web for "JPA primary key UUID versus number". Here comes my personal summary.

UUID Generator versus SEQUENCE Number

(1)
A UUID (Universally Unique Identifier) is a 128-bit value (16 bytes). Its canonical text form is 36 characters long, consisting of 32 hex digits and 4 dashes. Such UUIDs are, for all practical purposes, globally unique in space and time, even if they were generated on different computers. The application, not the database, would generate them.

@MappedSuperclass
public abstract class BaseEntity
{
    @Id
    private UUID id = UUID.randomUUID();
    ....
}
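
To make the size figures concrete, here is a small sketch using only java.util.UUID from the JDK. The class and method names (UuidFacts, canonicalLength, bitSize) are made up for this illustration.

```java
import java.util.UUID;

class UuidFacts {
    /** Length of the canonical text form: 32 hex digits plus 4 dashes. */
    static int canonicalLength(UUID uuid) {
        return uuid.toString().length();
    }

    /** A UUID value is held in two 64-bit longs, which makes 128 bits. */
    static int bitSize() {
        return 2 * Long.SIZE;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID(); // random-based UUID, version 4
        System.out.println(id + " has " + canonicalLength(id) + " characters");
        System.out.println("version " + id.version() + ", " + bitSize() + " bits");
    }
}
```

How much space the id takes in the database depends on the column type the JPA provider maps it to, a text column of 36 characters or a binary column of 16 bytes being typical choices.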

(2)
Drawing numeric ids from database sequences is a tradition, although surprisingly not all database products support sequences. Most old databases have numeric primary keys with running sequences, so setting up an object-relational model for an existing database may leave no other choice than using numeric keys.
Mind that a sequence number is not a numeric UUID: a number drawn from a sequence cannot be globally unique. So the scope of numeric ids is always a table or a database instance.

@MappedSuperclass
public abstract class BaseEntity
{
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "DatabaseGlobalSequence")
    @SequenceGenerator(name = "DatabaseGlobalSequence", initialValue = 1, allocationSize = 10)
    private Long id;
    ....
}
→ If you want to (or must) use one sequence per table, you cannot keep the id in the BaseEntity class. In that case you must put it into all sub-classes and write different sequence names into the related annotations. Still, using a public abstract getId() method, you can keep the code for equals() and hashCode() in BaseEntity.
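
The abstract getId() idea could look roughly like the following sketch. It is plain Java so that it compiles without a JPA library; the annotations appear only as comments. The sequence names "PersonSequence" and "AddressSequence", the Address class, and the setIdForDemo() helper are assumptions made up for illustration.

```java
// @MappedSuperclass
abstract class BaseEntity {
    /** Each sub-class maps its own @Id field and returns it here. */
    public abstract Long getId();

    @Override
    public final boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;
        final Long id = getId();
        // without an id, only the very same instance is equal
        return id != null && id.equals(((BaseEntity) o).getId());
    }

    @Override
    public final int hashCode() {
        final Long id = getId();
        return (id != null) ? id.hashCode() : super.hashCode();
    }
}

// @Entity
class Person extends BaseEntity {
    // @Id
    // @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "PersonSequence")
    // @SequenceGenerator(name = "PersonSequence", allocationSize = 10)
    private Long id;

    @Override
    public Long getId() { return id; }

    /** Hypothetical helper, stands in for the JPA provider setting the id. */
    void setIdForDemo(Long id) { this.id = id; }
}

// @Entity
class Address extends BaseEntity {
    // @Id
    // @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "AddressSequence")
    // @SequenceGenerator(name = "AddressSequence", allocationSize = 10)
    private Long id;

    @Override
    public Long getId() { return id; }

    /** Hypothetical helper, stands in for the JPA provider setting the id. */
    void setIdForDemo(Long id) { this.id = id; }
}
```

Because equals() also compares the classes, a Person and an Address that happen to draw the same number from their separate sequences are still not equal.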

So, if you have the freedom to choose your primary key class ...

Pro UUID:

Contra UUID:

So What?

The web articles that argue against UUIDs cite performance and storage consumption as reasons. They were dealing with big data and poor responsiveness. So common advice may be:

Don't expect silver bullets to work:-)

Behavior

The following is about why a durable hashCode/equals implementation makes sense.

Unit Test

Here is a unit test that targets the behavior of the primary key, concerning the "undefined behavior" in case the id changes during the lifetime of the entity.
For the complete implementation and how to turn this into a concrete test with a certain JPA provider, please see my recent blog post.

public abstract class JpaTest
{
    ....

    /** Setting the id changes hashCode() and equals(). */
    @Test
    public void shouldNotDisappearFromHashContainerWhenPersisting() {
        final BaseEntity person = newPerson("John Doe");

        final Set<Object> set = new HashSet<>();
        set.add(person);
        assertTrue(set.contains(person));

        transactional(em::persist, person);

        try {
            assertTrue(set.contains(person)); // will NOT work when id changed on persist()
        }
        finally {
            transactional(em::remove, person);
        }
    }

    ....
}

Look at line 11, the set.add(person) call. There the application puts a transient entity into a hash-container. Now look at line 17, the second set.contains(person) assertion. It checks, after persist(), whether the persisted entity is still found in the hash-container by the contains() method. We know that, depending on how hashCode/equals was implemented, this assertion could fail.

Sequence Number Primary Key

First let's try that unit test with the following BaseEntity:

import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;
import javax.persistence.SequenceGenerator;

@MappedSuperclass
public abstract class BaseEntity
{
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "DatabaseGlobalSequence")
    @SequenceGenerator(name = "DatabaseGlobalSequence", initialValue = 1, allocationSize = 10)
    private Long id;

    /** @return the primary key of this entity. */
    public final Long getId() {
        return id;
    }

    /** Overridden to delegate to class-equality and id (when not null). */
    @Override
    public final boolean equals(Object o) {
        if (this == o) // performance optimization
            return true;

        if (o == null || getClass() != o.getClass()) // exclude aliens
            return false; // and one-to-one entities with same id

        final BaseEntity other = (BaseEntity) o;
        if (id == null || other.id == null) // can't use id
            return super.equals(o);

        return id.equals(other.id); // delegate equality to id
    }

    /** Overridden to delegate to id when not null, else to super. */
    @Override
    public final int hashCode() {
        return (id != null) ? id.hashCode() : super.hashCode();
    }
}

This uses a database-generated sequence called "DatabaseGlobalSequence". The @SequenceGenerator annotation is referenced by the preceding @GeneratedValue(generator=...) annotation.

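We see that the test fails in line 17. Although the entity is still inside the set, the contains() method returned false. This is because the hash code changed on the fly: the id went from null to the sequence value generated by the JPA provider on persist(), so hashCode() switched from the identity hash inherited from Object to the hash of that id, and the set now looks in the wrong place.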
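
The effect can be reproduced without any JPA provider. The following is a minimal sketch in plain Java, where the hypothetical assignId() method stands in for what persist() does; the class names SequenceIdEntity and HashChangeDemo are made up for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

class SequenceIdEntity {
    private Long id; // null until the entity is "persisted"

    /** Hypothetical stand-in for the JPA provider assigning the sequence value. */
    void assignId(long sequenceValue) {
        this.id = sequenceValue;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;
        final SequenceIdEntity other = (SequenceIdEntity) o;
        if (id == null || other.id == null)
            return super.equals(o);
        return id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return (id != null) ? id.hashCode() : super.hashCode();
    }
}

class HashChangeDemo {
    /** @return whether the set still finds the entity after its id was assigned. */
    static boolean foundAfterIdAssignment() {
        final SequenceIdEntity person = new SequenceIdEntity();
        final Set<Object> set = new HashSet<>();
        set.add(person);        // stored under the identity hash code
        person.assignId(4711L); // from now on hashCode() is derived from the id
        return set.contains(person);
    }

    public static void main(String[] args) {
        // false, unless the identity hash happened to equal the hash of the id
        System.out.println(foundAfterIdAssignment());
    }
}
```

The entity is still physically in the set, iterating over the set would show it, but any lookup by hash misses it.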

UUID Primary Key

Now let's try the same with a UUID in the BaseEntity implementation:

import java.util.UUID;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;

@MappedSuperclass
public abstract class BaseEntity
{
    @Id
    private UUID id = UUID.randomUUID();

    /** @return the primary key of this entity. */
    public final UUID getId() {
        return id;
    }

    /** Overridden to delegate to class-equality and id. */
    @Override
    public final boolean equals(Object o) {
        if (this == o) // performance optimization
            return true;

        if (o == null || getClass() != o.getClass()) // exclude aliens
            return false; // and one-to-one entities with same id

        final BaseEntity other = (BaseEntity) o;
        return id.equals(other.id); // delegate equality to id
    }

    /** Overridden to delegate to id. */
    @Override
    public final int hashCode() {
        return id.hashCode();
    }
}

We see that this implementation is shorter than the previous one. It doesn't have the @GeneratedValue annotation because the id value is set by the UUID.randomUUID() generator at the entity's construction time. Thus the hashCode/equals implementation is durable now: it relies on an id that is never null and does not change on persist().

This time the test succeeds.
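
Again this can be illustrated in plain Java without a JPA provider. In the sketch below, the second constructor is a hypothetical stand-in for the provider loading a row and overwriting the id field; the class names UuidEntity and DurableHashDemo are made up for this illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class UuidEntity {
    private UUID id = UUID.randomUUID(); // never null, assigned at construction

    UuidEntity() {
    }

    /** Hypothetical: simulates a copy of the same row loaded from the database. */
    UuidEntity(UUID loadedId) {
        this.id = loadedId;
    }

    UUID getId() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;
        return id.equals(((UuidEntity) o).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

class DurableHashDemo {
    /** The id is not touched by persisting, so the lookup keeps working. */
    static boolean foundAfterPersist() {
        final UuidEntity person = new UuidEntity();
        final Set<Object> set = new HashSet<>();
        set.add(person);
        // persist() would happen here; it has no reason to change the id
        return set.contains(person);
    }

    /** A second instance carrying the same id is found as well. */
    static boolean reloadedCopyIsFound() {
        final UuidEntity person = new UuidEntity();
        final Set<Object> set = new HashSet<>();
        set.add(person);
        return set.contains(new UuidEntity(person.getId()));
    }
}
```

The second method shows a side benefit: an instance read back later with the same id is equal to the one that was put into the set before persisting.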

Conclusion

Databases are production assets. Every company tries to keep its persistence structures unchanged for as long as possible, and every modification is seen as critical. It absolutely matters how contents are identified and related. The best way may be to keep the type of the primary key flexible and isolated. Some even advise using both a SEQUENCE number and a UUID together.
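
One possible reading of "both together" is a numeric primary key for the database plus a UUID that carries the identity in Java code. The sketch below is only an assumption about how that could look, not a recommendation taken from a particular source. It is plain Java with the JPA annotations as comments, and the names DualKeyEntity, uuid and assignId() are made up.

```java
import java.util.UUID;

// @MappedSuperclass
class DualKeyEntity {
    // @Id
    // @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id; // compact numeric key for joins and foreign keys

    // @Column(nullable = false, unique = true, updatable = false)
    private UUID uuid = UUID.randomUUID(); // stable identity from construction on

    Long getId() {
        return id;
    }

    UUID getUuid() {
        return uuid;
    }

    /** Hypothetical stand-in for the JPA provider assigning the sequence value. */
    void assignId(long sequenceValue) {
        this.id = sequenceValue;
    }

    @Override
    public final boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;
        return uuid.equals(((DualKeyEntity) o).uuid); // the numeric id plays no role
    }

    @Override
    public final int hashCode() {
        return uuid.hashCode(); // unaffected by the id being assigned later
    }
}
```

The price is an additional unique column, so the storage argument against UUIDs is not avoided by this variant.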





ɔ⃝ Fritz Ritzberger, 2019-12-29