ORM hate in 2024 | hirotomurai.com

What is ORM?

ORM stands for Object-Relational Mapping. It involves parsing the binary or string data retrieved from a relational database (in practice, through a DB client) and repacking it into class instances for use within an application.

ORM is needed because of a fundamental mismatch: RDBs manage normalized data, while applications handle denormalized data in the form of classes and their fields.

Expectations and Reality of ORM

ORM is often expected to hide database-specific operations completely and to automate the mapping process. In reality, developers need to be aware of the specific database early in development. In addition, to follow ORM conventions, the application's core logic (the domain model) may become distorted. Avoiding that distortion means writing separate ORM classes and domain models, so the mapping ends up being implemented twice. The result is a higher learning cost for the ORM plus a redundant transformation layer, which defeats the purpose.

The models below are defined in SQLAlchemy. We have chosen Python, which is rarely chosen for implementing a core domain, so that the example can be read without preconceptions. It’s a bit cluttered at first glance. What is Base? How should Mapped types be initialised and unwrapped?

ORM relieves us of the tedious transformation tasks, but it increases onboarding costs and decreases personnel mobility within development teams. Depending on the language and library used, it may also visually emphasize storage information in the domain model, distracting developers from the problem domain.

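from typing import List, Optional

from sqlalchemy import ForeignKey, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

class Base(DeclarativeBase):
    pass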

class User(Base):
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column(String(30))
    fullname: Mapped[Optional[str]]

    addresses: Mapped[List["Address"]] = relationship(
        back_populates="user", cascade="all, delete-orphan"
    )

class Address(Base):
    __tablename__ = "address"

    id: Mapped[int] = mapped_column(primary_key=True)
    email_address: Mapped[str]
    user_id: Mapped[int] = mapped_column(ForeignKey("user_account.id"))

    user: Mapped["User"] = relationship(back_populates="addresses")

Implement ORM by hand

With the enhancement of built-in SQL functions, such as the JSON functions, it is now possible to construct denormalized data intuitively using SQL alone. The work an ORM used to do can be written succinctly from scratch with just a primitive DB adapter and JSON deserialisation (using PostgreSQL as the RDB).

from dataclasses import dataclass
from typing import NewType, Optional

import psycopg  # psycopg 3 is imported as "psycopg", not "psycopg3"

Address = NewType('Address', str)
UserId = NewType('UserId', int)

@dataclass
class User:
    id: UserId
    name: str
    fullname: Optional[str]
    addresses: list[Address]

class UserRepository:
    def __init__(self, conn: psycopg.Connection):
        self.conn = conn

    def user_by_id(self, user_id: UserId) -> Optional[User]:
        with self.conn.cursor() as cur:
            # Let PostgreSQL assemble the whole object as one JSON value.
            cur.execute("""
                SELECT json_build_object(
                    'id', id,
                    'name', name,
                    'fullname', fullname,
                    'addresses', addresses
                ) AS user_json
                FROM users WHERE id = %s
            """, (user_id,))

            result = cur.fetchone()
            if result is None:
                return None

            # psycopg 3 parses json values into Python objects by default,
            # so the row already holds a dict whose keys match the User fields.
            user_json = result[0]
            return User(**user_json)

Only a primitive DB adapter and JSON deserialisation are used here. Even if you are not familiar with the libraries, you can probably understand what the code does at a glance.
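The idea of letting SQL build the denormalized object can be tried without a PostgreSQL server. The following is a minimal sketch that swaps in SQLite (via Python's standard `sqlite3` module, assuming a build with the JSON functions enabled) as a stand-in: `json_object` and `json_group_array` play the roles that `json_build_object` and `json_agg` play in PostgreSQL. The table layout mirrors the SQLAlchemy models above. `make_demo_db` and the sample rows are hypothetical and exist only for illustration.

```python
import json
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str
    fullname: Optional[str]
    addresses: list[str]


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user_account (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            fullname TEXT
        );
        CREATE TABLE address (
            id INTEGER PRIMARY KEY,
            email_address TEXT NOT NULL,
            user_id INTEGER NOT NULL REFERENCES user_account(id)
        );
        INSERT INTO user_account VALUES (1, 'spongebob', 'Spongebob Squarepants');
        INSERT INTO user_account VALUES (2, 'sandy', NULL);
        INSERT INTO address VALUES (1, 'a@example.com', 1);
        INSERT INTO address VALUES (2, 'b@example.com', 1);
        """
    )
    return conn


def user_by_id(conn: sqlite3.Connection, user_id: int) -> Optional[User]:
    # One query returns the user and its addresses as a single JSON document.
    # json(...) re-parses the subquery result so it is embedded as a JSON
    # array rather than as a quoted string.
    row = conn.execute(
        """
        SELECT json_object(
            'id', u.id,
            'name', u.name,
            'fullname', u.fullname,
            'addresses', json((
                SELECT json_group_array(a.email_address)
                FROM address AS a
                WHERE a.user_id = u.id
            ))
        )
        FROM user_account AS u
        WHERE u.id = ?
        """,
        (user_id,),
    ).fetchone()
    if row is None:
        return None
    # SQLite returns the JSON as text, so it is decoded explicitly here.
    return User(**json.loads(row[0]))
```

Calling `user_by_id(make_demo_db(), 1)` yields a `User` whose `addresses` list is filled from the `address` table, with no mapping layer beyond the dataclass constructor.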

Because the data access layer is written in SQL, new engineers can become productive quickly. This low onboarding cost is particularly valuable in environments with time constraints and high personnel turnover, such as startups. The high portability of SQL, which is independent of the programming language, is another significant advantage. In addition, the almost complete separation of domain models and DB layers allows for a division of labour.

As of 2024, DDD and Clean Architecture are common practice, so domain logic tends to be framework-independent and simple in structure. If the data access layer is also written in plain SQL rather than with an ORM, developers can migrate the application to another language with confidence.

I described the separation of domain models and DB layers above as only “almost complete”. That is because most serialisation libraries impose some restrictions, such as requiring the object to be annotated or its fields to be public. In most cases, especially in the early stages of a product, these restrictions are not a real problem, particularly in languages that provide pre-processing facilities such as macros and annotations. The following is an example of defining a serialisable object in Rust. For the simple case, just specify Serialize and Deserialize in derive.

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };

    let serialized = serde_json::to_string(&point).unwrap();
    println!("serialized = {}", serialized);

    let deserialized: Point = serde_json::from_str(&serialized).unwrap();
    println!("deserialized = {:?}", deserialized);
}

// output:
// serialized = {"x":1,"y":2}
// deserialized = Point { x: 1, y: 2 }

The next example is Kotlin, where the serialization library (kotlinx.serialization) is provided officially.

import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
import kotlinx.serialization.decodeFromString

@Serializable
data class Data(val a: Int, val b: String)

fun main() {
   val obj = Json.decodeFromString<Data>("""{"a":42, "b": "str"}""")
}

Of course, this approach can break down as serialisation requirements become more complex, for example when invariants must be guaranteed. Even so, the point at which it breaks down is generally further out than with an ORM.
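One way to keep invariants under hand-written deserialisation is to route every object through a constructor that validates it. The sketch below is only an illustration under assumptions: `EmailAddress`, `user_from_json`, and the deliberately simplistic rules (an address must contain `@`, a name must not be empty) are hypothetical and not taken from any particular library.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        # Invariant (simplified for illustration): an address contains "@".
        if "@" not in self.value:
            raise ValueError(f"invalid email address: {self.value!r}")


@dataclass(frozen=True)
class User:
    id: int
    name: str
    addresses: tuple[EmailAddress, ...]

    def __post_init__(self) -> None:
        # Invariant: a user always has a non-empty name.
        if not self.name:
            raise ValueError("name must not be empty")


def user_from_json(payload: str) -> User:
    """Decode a JSON document and build a User through its constructors."""
    data = json.loads(payload)
    return User(
        id=int(data["id"]),
        name=data["name"],
        # Each address goes through EmailAddress, so its check runs too.
        addresses=tuple(EmailAddress(a) for a in data["addresses"]),
    )
```

Because the checks live in `__post_init__`, a row that violates an invariant raises `ValueError` at the boundary instead of producing a half-valid domain object. The cost is that the decoding function has to be written by hand for each aggregate.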

When to Use ORM

If there is a de facto standard ORM for a language, it may be worth considering. In cases like Ruby on Rails, where the ORM layer and the framework are tightly integrated, using it is certainly advisable. Ultimately, the decision to adopt an ORM library is influenced more by external factors such as onboarding costs, portability, and readability than by application code quality.

However, using an ORM in microservices or with lightweight frameworks may cancel out their respective advantages and should be avoided.

When You Have to Use ORM That Does Not Match Your Use Case

Sometimes you are compelled to use an ORM that does not fit your use case. For instance, an application developed in Rails might need to separate its Read and Write models as it grows. In such cases, define separate objects for ORM mapping and for the domain layer, and implement the transformation on the ORM objects. This prevents the ORM from affecting the business logic.
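The separation described above might look roughly like the following sketch. `UserRecord` is a hypothetical plain-Python stand-in for an ORM-mapped class (in a real project it would be the ActiveRecord or SQLAlchemy model), and `to_domain`, `display_name`, and `primary_address` are names invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    """Domain model: knows nothing about tables, sessions, or the ORM."""
    id: int
    display_name: str
    addresses: tuple[str, ...]


class UserRecord:
    """Hypothetical stand-in for an ORM-mapped class."""

    def __init__(
        self,
        id: int,
        name: str,
        fullname: Optional[str],
        email_addresses: list[str],
    ) -> None:
        self.id = id
        self.name = name
        self.fullname = fullname
        self.email_addresses = email_addresses

    def to_domain(self) -> User:
        # The transformation lives on the ORM side, so the domain model
        # never has to import or reference the ORM.
        return User(
            id=self.id,
            display_name=self.fullname or self.name,
            addresses=tuple(self.email_addresses),
        )


def primary_address(user: User) -> Optional[str]:
    """Business logic written purely against the domain model."""
    return user.addresses[0] if user.addresses else None
```

With this layout, changing the ORM class (or replacing the ORM entirely) only affects `to_domain`, while functions such as `primary_address` stay untouched.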

Unaddressed Considerations

Detailed implementation and testing of mapping between DB models and domain models
Management of DB connections
Implementation of automated tests for the data access layer
Migrations