Skip to main content

Data Architecture

The data architecture of this project is built on SQLModel, which provides a unified interface for defining both database tables (ORM models) and data transfer objects (Pydantic schemas). This approach reduces boilerplate by allowing a single class to serve multiple purposes while maintaining strict separation between internal storage and external API representations.

Unified Modeling with SQLModel

The project leverages SQLModel's ability to define a class as a database table using the table=True parameter. By inheriting from a shared base class, the system ensures consistency between the data validated at the API boundary and the data persisted in the database.

For example, in backend/app/models.py, the User entity is split into several classes to handle different stages of the data lifecycle:

# backend/app/models.py

# Shared properties used across all User-related schemas
class UserBase(SQLModel):
email: EmailStr = Field(unique=True, index=True, max_length=255)
is_active: bool = True
is_superuser: bool = False
full_name: str | None = Field(default=None, max_length=255)

# The actual database table
class User(UserBase, table=True):
id: uuid.UUID = Field(default_factory=uuid.uuid4, primary_key=True)
hashed_password: str
created_at: datetime | None = Field(
default_factory=get_datetime_utc,
sa_type=DateTime(timezone=True),
)
items: list["Item"] = Relationship(back_populates="owner", cascade_delete=True)

The Schema Lifecycle Pattern

The architecture follows a consistent pattern for every major entity (Users and Items), ensuring that sensitive data is never exposed and that input validation is strictly enforced.

  1. Base Class (EntityBase): Defines common fields shared by both the API and the database (e.g., email, title).
  2. Creation Schema (EntityCreate): Inherits from the Base and adds fields required only during creation, such as a raw password.
  3. Update Schema (EntityUpdate): Inherits from the Base but makes all fields optional, allowing for partial updates (PATCH requests).
  4. Public Schema (EntityPublic): Defines the structure of data returned to the client. It includes the id and excludes sensitive fields like hashed_password.
  5. Database Model (Entity): The class with table=True that maps to the database. It includes internal fields like hashed_password and defines relationships.

Implementation Example: Items

The Item entity demonstrates this pattern in backend/app/models.py:

class ItemBase(SQLModel):
title: str = Field(min_length=1, max_length=255)
description: str | None = Field(default=None, max_length=255)

class ItemCreate(ItemBase):
pass

class Item(ItemBase, table=True):
id: uuid.UUID = Field(default_factory=uuid.uuid4, primary_key=True)
owner_id: uuid.UUID = Field(
foreign_key="user.id", nullable=False, ondelete="CASCADE"
)
owner: User | None = Relationship(back_populates="items")

class ItemPublic(ItemBase):
id: uuid.UUID
owner_id: uuid.UUID

Relational Integrity and Constraints

The project uses SQLAlchemy-level constraints and SQLModel relationships to maintain data integrity:

  • UUID Primary Keys: All models use uuid.UUID for primary keys, generated via uuid.uuid4.
  • Foreign Keys and Cascades: The Item model defines a foreign key to the User table with ondelete="CASCADE". Correspondingly, the User model's relationship to items includes cascade_delete=True. This ensures that if a user is deleted, all their associated items are automatically removed by the database.
  • Timestamps: Models use a helper function get_datetime_utc to ensure all created_at fields are stored in UTC, using SQLAlchemy's DateTime(timezone=True) type for cross-database compatibility.

Security-First Data Handling

The transition between API schemas and database models is a critical point for security. The project handles this in the CRUD layer (backend/app/crud.py) using model_validate.

Password Hashing

When creating a user, the UserCreate schema (containing a plain-text password) is converted into a User model. The password is explicitly hashed before being stored:

# backend/app/crud.py

def create_user(*, session: Session, user_create: UserCreate) -> User:
db_obj = User.model_validate(
user_create, update={"hashed_password": get_password_hash(user_create.password)}
)
session.add(db_obj)
session.commit()
session.refresh(db_obj)
return db_obj

Timing Attack Prevention

The authentication logic includes a defense against timing attacks. If a user is not found by email, the system still performs a password verification against a DUMMY_HASH. This ensures that the response time of the API is consistent regardless of whether the email exists in the database, preventing attackers from enumerating valid usernames.

# backend/app/crud.py

DUMMY_HASH = "$argon2id$v=19$m=65536,t=3,p=4$MjQyZWE1MzBjYjJlZTI0Yw$YTU4NGM5ZTZmYjE2NzZlZjY0ZWY3ZGRkY2U2OWFjNjk"

def authenticate(*, session: Session, email: str, password: str) -> User | None:
db_user = get_user_by_email(session=session, email=email)
if not db_user:
verify_password(password, DUMMY_HASH)
return None
# ... proceed with actual verification

System-Level Schemas

Beyond core entities, the architecture includes utility schemas for system communication:

  • Message: A generic schema used for simple API responses, such as confirming a deletion.
  • Token and TokenPayload: Used for JWT-based authentication. Token defines the structure of the response after a successful login, while TokenPayload defines the expected contents of the decoded JWT (the sub or subject field).

These schemas are also defined in backend/app/models.py using SQLModel, maintaining the consistent use of Pydantic-based validation throughout the entire application.