Design an Issue Tracker (Jira/Linear)

A comprehensive system design for an issue tracking and project management tool covering API design for dynamic workflows, efficient kanban board pagination, drag-and-drop ordering without full row updates, concurrent edit handling, and real-time synchronization. This design addresses the challenges of project-specific column configurations while maintaining consistent user-defined ordering across views.

High-level architecture: the API gateway routes requests to domain services, and WebSocket sync fans out via Redis pub/sub.

Abstract

Issue tracking systems solve three interconnected problems: flexible workflows (each project defines its own statuses and transitions), efficient ordering (issues maintain user-defined positions without expensive reindexing), and concurrent editing (multiple users can update the same issue simultaneously).

Core architectural decisions:

Decision	Choice	Rationale
Ordering algorithm	Fractional indexing (LexoRank)	O(1) reorders; only the moved row is written
API style	GraphQL with REST fallback	Flexible field selection for varied board views
Pagination	Per-column cursor-based	Ensures all columns load incrementally
Concurrency	Optimistic locking with version field	Low conflict rate in practice
Real-time sync	WebSocket transaction stream + last-write-wins	Propagation within the 300ms p99 target, simple conflict model
Rich-text fields	CRDT (Yjs / Automerge) only on description / comments	Conflict-free concurrent editing where it actually matters
Workflow storage	Polymorphic per-project	Projects own their status definitions
Authorization	RBAC at project + ABAC overlay for issue visibility	Mirrors Jira’s project-role + issue-security split
Search	Postgres FTS for small tenants, OpenSearch at scale	Same query API; switch backend per tenant size
Attachments	S3-class object store + presigned multipart	Keep large blobs out of Postgres
Notifications	Event bus + per-channel queues + per-user digest	Independent retry / backpressure per channel

Key trade-offs accepted:

Denormalized board state in Redis for fast reads, with async consistency
LexoRank strings grow unbounded, requiring periodic rebalancing
Last-write-wins may lose concurrent edits (acceptable for most fields)

What this design optimizes:

Drag-and-drop reordering updates exactly one row
Board loads show issues across all columns immediately
Workflow changes don’t require schema migrations

Requirements

Functional Requirements

Requirement	Priority	Notes
Create/edit/delete issues	Core	Title, description, assignee, type, priority
Project-specific workflows	Core	Custom statuses and transitions per project
Kanban board view	Core	Drag-drop between columns and within columns
Issue ordering within columns	Core	Persist user-defined order
Real-time updates	Core	See changes from other users immediately
Search and filter	Core	Full-text search, JQL-style queries
Comments and activity	Extended	Threaded comments, activity timeline
Attachments	Extended	File upload and preview
Sprints/iterations	Extended	Time-boxed groupings
Custom fields	Extended	Project-specific metadata

Non-Functional Requirements

Requirement	Target	Rationale
Availability	99.9% (3 nines)	User-facing, productivity critical
Board load time	p99 < 500ms	Must feel instant
Issue update latency	p99 < 200ms	Drag-drop must be responsive
Real-time propagation	p99 < 300ms	Collaborative editing feel
Search latency	p99 < 100ms	Autocomplete responsiveness
Concurrent users per board	100	Team collaboration scenario

Scale Estimation

Users:

Total users: 10M (Jira-scale)
Daily Active Users (DAU): 2M (20%)
Peak concurrent users: 500K

Projects and Issues:

Projects: 1M
Issues per project (active): 1,000 avg, 100,000 max
Total issues: 1B
Issues per board view: 200-500 typical

Traffic:

Board loads: 2M DAU × 10 loads/day = 20M/day = ~230 RPS
Issue updates: 2M DAU × 20 updates/day = 40M/day = ~460 RPS
Peak multiplier: 3x → 700 RPS board loads, 1,400 RPS updates
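The arithmetic behind these traffic figures can be reproduced with a few lines. This is only a sketch of the estimate above; the function name `averageRps` is illustrative and not part of the design.

```typescript
const SECONDS_PER_DAY = 24 * 60 * 60 // 86,400

// Average requests per second for a given DAU and per-user daily action count.
function averageRps(dau: number, actionsPerUserPerDay: number): number {
  return (dau * actionsPerUserPerDay) / SECONDS_PER_DAY
}

const DAU = 2_000_000
const PEAK_MULTIPLIER = 3

const boardLoadRps = averageRps(DAU, 10) // about 231, quoted as ~230 above
const updateRps = averageRps(DAU, 20) // about 463, quoted as ~460 above

const peakBoardLoadRps = boardLoadRps * PEAK_MULTIPLIER // about 694, rounded to 700
const peakUpdateRps = updateRps * PEAK_MULTIPLIER // about 1,389, rounded to 1,400
```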

Storage:

Issue size: 5KB avg (metadata + description)
Total issue storage: 1B × 5KB = 5TB
Attachments: 50TB (separate object storage)
Activity log: 20TB (append-only)

Design Paths

Path A: Server-Authoritative with REST API

Best when:

Team familiar with REST patterns
Simpler infrastructure requirements
Offline support not critical
Moderate real-time requirements

Architecture:

REST API request flow for issue moves: client patches the API, the API persists the change in Postgres and fans the event out through a WebSocket layer.

Trade-offs:

✅ Simple mental model
✅ Standard tooling and caching
✅ Easy to debug
❌ Over-fetching/under-fetching without careful design
❌ Multiple round trips for complex operations
❌ Real-time requires separate WebSocket layer

Real-world example: Jira Cloud exposes a REST API for issue and board operations and uses LexoRank for ordering (Jira Software Cloud REST API, Atlassian KB: LexoRank).

Path B: Local-First with Sync Engine

Best when:

Offline support is critical
Sub-100ms UI responsiveness required
Team can invest in sync infrastructure
Users on unreliable networks

Architecture:

Trade-offs:

✅ Instant UI response (local-first)
✅ Full offline support
✅ Minimal network traffic (deltas only)
❌ Complex sync logic
❌ Conflict resolution complexity
❌ Larger client-side footprint

Real-world example: Linear bootstraps a workspace into IndexedDB and a MobX-managed in-memory object graph, then keeps it in sync over a WebSocket transaction stream — letting the UI read and write locally with no network in the hot path (Scaling the Linear Sync Engine). Each server-acknowledged write bumps a workspace-wide lastSyncId; clients use it as a cursor to ask for missed deltas after a reconnect. The sync model is last-write-wins for scalar fields, with CRDTs reserved for rich-text issue descriptions (reverse-linear-sync-engine).
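The cursor-based catch-up described here can be sketched in memory. This is not Linear's implementation: `SyncLog`, `deltasSince`, and `applyDeltas` are hypothetical names, and the sketch only shows how a monotonically increasing sync id lets a reconnecting client ask for exactly what it missed.

```typescript
interface Delta {
  syncId: number // workspace-wide, strictly increasing
  modelId: string
  changes: Record<string, unknown>
}

class SyncLog {
  private deltas: Delta[] = []
  private lastSyncId = 0

  // Record a server-acknowledged write and assign it the next sync id.
  append(modelId: string, changes: Record<string, unknown>): Delta {
    this.lastSyncId += 1
    const delta: Delta = { syncId: this.lastSyncId, modelId, changes }
    this.deltas.push(delta)
    return delta
  }

  // Everything a client missed: deltas with a sync id greater than its cursor.
  deltasSince(cursor: number): Delta[] {
    return this.deltas.filter((d) => d.syncId > cursor)
  }
}

// Client side: apply deltas in order (last write wins per field), advance the cursor.
function applyDeltas(
  store: Map<string, Record<string, unknown>>,
  cursor: number,
  deltas: Delta[],
): number {
  let next = cursor
  for (const d of deltas) {
    if (d.syncId <= next) continue // already applied
    store.set(d.modelId, { ...(store.get(d.modelId) ?? {}), ...d.changes })
    next = d.syncId
  }
  return next
}

const log = new SyncLog()
const store = new Map<string, Record<string, unknown>>()

log.append("issue-1", { title: "Login page" })
let cursor = applyDeltas(store, 0, log.deltasSince(0))

// The client disconnects; two more writes land on the server.
log.append("issue-1", { title: "Login page v2" })
log.append("issue-2", { title: "Signup page" })

// On reconnect the client requests only the deltas after its cursor.
cursor = applyDeltas(store, cursor, log.deltasSince(cursor))
```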

Path C: GraphQL with Optimistic Updates

Best when:

Varied client needs (web, mobile, integrations)
Complex data relationships
Need flexibility without over-fetching
Subscriptions for real-time

Architecture:

mutation MoveIssue($input: MoveIssueInput!) {
  moveIssue(input: $input) {
    issue {
      id
      status {
        id
        name
      }
      rank
      updatedAt
    }
  }
}

subscription OnBoardUpdate($boardId: ID!) {
  boardUpdated(boardId: $boardId) {
    issue {
      id
      status {
        id
      }
      rank
    }
    action
  }
}
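The "optimistic" half of this path happens on the client: the move is written into local state before the mutation returns, then either confirmed or rolled back. A minimal sketch under stated assumptions follows; `applyOptimistic`, `reconcile`, and the `MoveResult` shape are hypothetical and stand in for whatever the GraphQL client's cache layer provides.

```typescript
interface Card {
  id: string
  statusId: string
  rank: string
  version: number
}

type MoveResult = { ok: true; card: Card } | { ok: false; code: string }

interface PendingMove {
  issueId: string
  snapshot: Card // state before the optimistic write, kept for rollback
}

// Step 1: write the move into local state before the server answers.
function applyOptimistic(
  board: Map<string, Card>,
  issueId: string,
  toStatusId: string,
  provisionalRank: string,
): PendingMove | null {
  const snapshot = board.get(issueId)
  if (!snapshot) return null
  board.set(issueId, { ...snapshot, statusId: toStatusId, rank: provisionalRank })
  return { issueId, snapshot }
}

// Step 2: adopt the server's result, or roll back if the mutation was rejected.
function reconcile(board: Map<string, Card>, pending: PendingMove, result: MoveResult): void {
  board.set(pending.issueId, result.ok ? result.card : pending.snapshot)
}

const board = new Map<string, Card>([
  ["issue-1", { id: "issue-1", statusId: "todo", rank: "0|i", version: 5 }],
])

// Rejected move: the card snaps back to its original column.
const rejected = applyOptimistic(board, "issue-1", "done", "0|q")
if (rejected) reconcile(board, rejected, { ok: false, code: "VERSION_CONFLICT" })
const afterReject = board.get("issue-1")

// Accepted move: the server's rank and version replace the provisional ones.
const accepted = applyOptimistic(board, "issue-1", "done", "0|q")
if (accepted) {
  reconcile(board, accepted, {
    ok: true,
    card: { id: "issue-1", statusId: "done", rank: "0|qi", version: 6 },
  })
}
const afterAccept = board.get("issue-1")
```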

Trade-offs:

✅ Flexible queries for different views
✅ Built-in subscriptions for real-time
✅ Single endpoint simplifies client
❌ Caching more complex
❌ Rate limiting harder
❌ Learning curve for teams

Real-world example: Linear’s public API is GraphQL-only and is the same API its web and desktop clients use (Linear GraphQL API). GitHub also exposes its issue and project surface via GraphQL (GitHub GraphQL API).

Path Comparison

Factor	REST	Local-First	GraphQL
Implementation complexity	Low	High	Medium
UI responsiveness	Medium	Excellent	Good
Offline support	Limited	Native	Limited
Client flexibility	Low	Low	High
Real-time complexity	Separate	Built-in	Built-in
Caching	Simple	Complex	Medium

This Article’s Focus

This article focuses on Path C (GraphQL with REST fallback) because:

Flexible field selection suits varied board configurations
Subscriptions provide native real-time support
REST endpoints can coexist for webhooks and simple integrations
It matches what modern issue trackers expose externally — Linear’s API is GraphQL-only, and GitHub Issues / Projects ship a GraphQL surface alongside REST

High-Level Design

Component Overview

Service decomposition: GraphQL/REST/WebSocket fronting Issue, Project, Workflow, Board, Search, and Activity services on Postgres, Redis, Elasticsearch, and Kafka.

Issue Service

Handles core issue CRUD operations and ordering.

Responsibilities:

Create, read, update, delete issues
Rank calculation for ordering
Status transitions with workflow validation
Optimistic locking for concurrent updates

Key design decisions:

Decision	Choice	Rationale
Primary key	UUID	Distributed ID generation, no coordination
Ordering	LexoRank string	O(1) reordering without cascading updates
Versioning	Monotonic version field	Optimistic locking for concurrent edits

Project Service

Manages project configuration including workflows.

Responsibilities:

Project CRUD
Workflow definition per project
Status and transition management
Board configuration (columns, filters)

Design decision: Each project owns its workflow definition. Statuses are project-scoped, not global. This allows teams to customize without affecting others.

Board Service

Optimizes board view queries by maintaining denormalized state.

Responsibilities:

Cache board state in Redis
Compute issue counts per column
Handle board-level operations (collapse column, set WIP limits)

Why a separate service: Board queries require joining issues, statuses, and users. Denormalizing into Redis achieves sub-50ms board loads.

Workflow Service

Enforces workflow rules and transitions.

Responsibilities:

Validate status transitions
Execute transition side effects (webhooks, automations)
Maintain workflow history

Transition validation flow:

Workflow transition validation: Issue Service asks Workflow Service whether the proposed status change is allowed before persisting the update.
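The check the Workflow Service performs can be sketched as a pure function over the project's transition rules. The names here are illustrative; treating a `null` source as "from any status" follows the schema later in this article, while allowing a same-status move (a reorder within a column) is an assumption of this sketch.

```typescript
interface TransitionRule {
  fromStatusId: string | null // null = allowed from any status
  toStatusId: string
}

function isTransitionAllowed(rules: TransitionRule[], from: string, to: string): boolean {
  // Assumption: staying in the same column is a reorder, not a workflow transition.
  if (from === to) return true
  return rules.some(
    (r) => r.toStatusId === to && (r.fromStatusId === null || r.fromStatusId === from),
  )
}

const rules: TransitionRule[] = [
  { fromStatusId: "todo", toStatusId: "in_progress" },
  { fromStatusId: "in_progress", toStatusId: "done" },
  { fromStatusId: null, toStatusId: "cancelled" }, // reachable from anywhere
]

const startWork = isTransitionAllowed(rules, "todo", "in_progress")
const skipToDone = isTransitionAllowed(rules, "todo", "done")
const cancelFromDone = isTransitionAllowed(rules, "done", "cancelled")
```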

API Design

GraphQL Schema (Core Types)

type Issue {
  id: ID!
  key: String! # e.g., "PROJ-123"
  title: String!
  description: String
  status: Status!
  assignee: User
  reporter: User!
  priority: Priority!
  issueType: IssueType!
  rank: String! # LexoRank for ordering
  version: Int! # Optimistic locking
  project: Project!
  comments(first: Int, after: String): CommentConnection!
  activity(first: Int, after: String): ActivityConnection!
  createdAt: DateTime!
  updatedAt: DateTime!
}

type Status {
  id: ID!
  name: String!
  category: StatusCategory! # TODO, IN_PROGRESS, DONE
  color: String!
  position: Int! # Column order
}

type Project {
  id: ID!
  key: String!
  name: String!
  workflow: Workflow!
  statuses: [Status!]!
  issueTypes: [IssueType!]!
}

type Workflow {
  id: ID!
  name: String!
  statuses: [Status!]!
  transitions: [Transition!]!
}

type Transition {
  id: ID!
  name: String!
  fromStatus: Status
  toStatus: Status!
  conditions: [TransitionCondition!]
}

enum StatusCategory {
  TODO
  IN_PROGRESS
  DONE
}

enum Priority {
  LOWEST
  LOW
  MEDIUM
  HIGH
  HIGHEST
}

Board Query with Per-Column Pagination

The key challenge: fetch issues across multiple columns where each column can have different numbers of issues.

Naive approach (problematic):

# BAD: Fetches all issues, client groups by status
query {
  issues(projectId: "proj-1", first: 100) {
    nodes {
      id
      status {
        id
      }
    }
  }
}
# Problem: If 90 issues are in "To Do", other columns appear empty

Per-column pagination approach:

type BoardColumn {
  status: Status!
  issues(first: Int!, after: String): IssueConnection!
  totalCount: Int!
}

type Board {
  id: ID!
  project: Project!
  columns: [BoardColumn!]!
}

query GetBoard($projectId: ID!, $issuesPerColumn: Int!) {
  board(projectId: $projectId) {
    columns {
      status {
        id
        name
        color
      }
      totalCount
      issues(first: $issuesPerColumn) {
        nodes {
          id
          key
          title
          assignee {
            id
            name
            avatar
          }
          priority
          rank
        }
        pageInfo {
          hasNextPage
          endCursor
        }
      }
    }
  }
}

Response structure:

{
  "data": {
    "board": {
      "columns": [
        {
          "status": { "id": "status-1", "name": "To Do", "color": "#808080" },
          "totalCount": 45,
          "issues": {
            "nodes": [
              /* first 20 issues */
            ],
            "pageInfo": { "hasNextPage": true, "endCursor": "cursor-abc" }
          }
        },
        {
          "status": { "id": "status-2", "name": "In Progress", "color": "#0052cc" },
          "totalCount": 12,
          "issues": {
            "nodes": [
              /* first 12 issues - no more pages */
            ],
            "pageInfo": { "hasNextPage": false, "endCursor": "cursor-xyz" }
          }
        },
        {
          "status": { "id": "status-3", "name": "Done", "color": "#36b37e" },
          "totalCount": 89,
          "issues": {
            "nodes": [
              /* first 20 issues */
            ],
            "pageInfo": { "hasNextPage": true, "endCursor": "cursor-def" }
          }
        }
      ]
    }
  }
}

Load more for specific column:

query LoadMoreIssues($statusId: ID!, $after: String!) {
  column(statusId: $statusId) {
    issues(first: 20, after: $after) {
      nodes {
        id
        key
        title
        rank
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}
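One way a resolver could serve the `issues(first, after)` connection is keyset pagination on `(rank, id)`, which never skips or repeats rows when issues are inserted between page loads. The sketch below runs in memory and uses illustrative names (`pageColumn`, `encodeCursor`); a database-backed version would presumably push the same comparison into a `WHERE` clause served by the `(project_id, status_id, rank)` index.

```typescript
interface IssueRow {
  id: string
  statusId: string
  rank: string
}

interface IssuePage {
  nodes: IssueRow[]
  hasNextPage: boolean
  endCursor: string | null
}

// The cursor carries the sort key of the last row returned. A real API would
// typically base64-encode it so clients treat it as opaque.
function encodeCursor(row: IssueRow): string {
  return JSON.stringify([row.rank, row.id])
}

function decodeCursor(cursor: string): { rank: string; id: string } {
  const [rank, id] = JSON.parse(cursor) as [string, string]
  return { rank, id }
}

// Total order on (rank, id); id breaks ties if two rows ever share a rank.
function compareKeys(aRank: string, aId: string, bRank: string, bId: string): number {
  if (aRank !== bRank) return aRank < bRank ? -1 : 1
  if (aId !== bId) return aId < bId ? -1 : 1
  return 0
}

function pageColumn(rows: IssueRow[], statusId: string, first: number, after?: string): IssuePage {
  let column = rows
    .filter((r) => r.statusId === statusId)
    .sort((a, b) => compareKeys(a.rank, a.id, b.rank, b.id))

  if (after !== undefined) {
    const cursor = decodeCursor(after)
    // Keyset condition: keep only rows strictly after the cursor's (rank, id).
    column = column.filter((r) => compareKeys(r.rank, r.id, cursor.rank, cursor.id) > 0)
  }

  const nodes = column.slice(0, first)
  const last = nodes[nodes.length - 1]
  return {
    nodes,
    hasNextPage: column.length > first,
    endCursor: last ? encodeCursor(last) : null,
  }
}

const rows: IssueRow[] = [
  { id: "i3", statusId: "todo", rank: "0|c" },
  { id: "i1", statusId: "todo", rank: "0|a" },
  { id: "i4", statusId: "done", rank: "0|a" },
  { id: "i2", statusId: "todo", rank: "0|b" },
]

const page1 = pageColumn(rows, "todo", 2)
const page2 = pageColumn(rows, "todo", 2, page1.endCursor ?? undefined)
```

Each column pages independently, so a long "To Do" column cannot starve the others of results.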

Issue Mutations

Move Issue (status change + reorder):

input MoveIssueInput {
  issueId: ID!
  toStatusId: ID!
  rankAfterId: ID # Issue to position after (null = top)
  rankBeforeId: ID # Issue to position before (null = bottom)
  version: Int! # For optimistic locking
}

type MoveIssuePayload {
  issue: Issue
  error: MoveIssueError
}

type MoveIssueError {
  code: MoveIssueErrorCode!
  message: String!
}

enum MoveIssueErrorCode {
  ISSUE_NOT_FOUND
  INVALID_TRANSITION
  VERSION_CONFLICT
  PERMISSION_DENIED
}

mutation MoveIssue($input: MoveIssueInput!) {
  moveIssue(input: $input) {
    issue {
      id
      status {
        id
        name
      }
      rank
      version
      updatedAt
    }
    error {
      code
      message
    }
  }
}

Update Issue:

input UpdateIssueInput {
  issueId: ID!
  title: String
  description: String
  assigneeId: ID
  priority: Priority
  version: Int!
}

mutation UpdateIssue($input: UpdateIssueInput!) {
  updateIssue(input: $input) {
    issue {
      id
      title
      description
      assignee {
        id
        name
      }
      priority
      version
      updatedAt
    }
    error {
      code
      message
    }
  }
}

Real-time Subscriptions

type BoardEvent {
  issue: Issue!
  action: BoardAction!
  previousStatusId: ID # For status changes
  previousRank: String # For reorders
}

enum BoardAction {
  CREATED
  UPDATED
  MOVED
  DELETED
}

subscription OnBoardChange($projectId: ID!) {
  boardChanged(projectId: $projectId) {
    issue {
      id
      key
      title
      status {
        id
      }
      rank
      assignee {
        id
        name
      }
      version
    }
    action
    previousStatusId
  }
}
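On the receiving side, a client has to fold these events into its local board state and ignore events that arrive late or twice. The reducer below is a sketch under stated assumptions: `applyBoardEvent` and `columnOrder` are hypothetical names, and dropping events whose `version` is not newer than the local copy is one possible policy, not something the schema mandates.

```typescript
type BoardAction = "CREATED" | "UPDATED" | "MOVED" | "DELETED"

interface CardState {
  id: string
  statusId: string
  rank: string
  version: number
}

interface BoardEvent {
  action: BoardAction
  issue: CardState
}

// Returns true if the event changed local state.
function applyBoardEvent(cards: Map<string, CardState>, event: BoardEvent): boolean {
  if (event.action === "DELETED") return cards.delete(event.issue.id)

  const current = cards.get(event.issue.id)
  // Versions only move forward, so an equal or older version is stale or a duplicate.
  if (current && event.issue.version <= current.version) return false

  cards.set(event.issue.id, event.issue)
  return true
}

// Issue ids of one column in display order (ascending rank).
function columnOrder(cards: Map<string, CardState>, statusId: string): string[] {
  return Array.from(cards.values())
    .filter((c) => c.statusId === statusId)
    .sort((a, b) => (a.rank < b.rank ? -1 : a.rank > b.rank ? 1 : 0))
    .map((c) => c.id)
}

const cards = new Map<string, CardState>()
applyBoardEvent(cards, {
  action: "CREATED",
  issue: { id: "a", statusId: "todo", rank: "0|i", version: 1 },
})
applyBoardEvent(cards, {
  action: "CREATED",
  issue: { id: "b", statusId: "todo", rank: "0|q", version: 1 },
})

// "b" is dragged above "a".
const moved = applyBoardEvent(cards, {
  action: "MOVED",
  issue: { id: "b", statusId: "todo", rank: "0|9", version: 2 },
})

// A delayed copy of the original CREATED event arrives and is ignored.
const stale = applyBoardEvent(cards, {
  action: "CREATED",
  issue: { id: "b", statusId: "todo", rank: "0|q", version: 1 },
})

const todoOrder = columnOrder(cards, "todo")
```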

REST API Fallback

For webhooks and simple integrations:

Move Issue:

PATCH /api/v1/issues/{issueId}/move
Content-Type: application/json
If-Match: "version-5"

{
  "statusId": "status-3",
  "rankAfterId": "issue-456",
  "rankBeforeId": null
}

Response:

HTTP/1.1 200 OK
ETag: "version-6"

{
  "id": "issue-123",
  "key": "PROJ-123",
  "status": { "id": "status-3", "name": "Done" },
  "rank": "0|i002bc",
  "version": 6,
  "updatedAt": "2024-02-03T10:00:00Z"
}

Error Responses:

Code	Error	When
400	`INVALID_TRANSITION`	Workflow doesn’t allow this status change
404	`NOT_FOUND`	Issue or target status doesn’t exist
409	`VERSION_CONFLICT`	Version mismatch (concurrent edit)
412	`PRECONDITION_FAILED`	ETag mismatch
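The `If-Match` header in the request above carries the issue version as an ETag. A handler might translate between the two roughly as follows. The helper names are hypothetical, and answering a missing or malformed header with 428 (Precondition Required) is an assumption of this sketch that does not appear in the error table.

```typescript
// Version N is exposed as the strong ETag "version-N" (quotes are part of the value).
function formatVersionEtag(version: number): string {
  return `"version-${version}"`
}

function parseVersionEtag(header: string | undefined): number | null {
  if (header === undefined) return null
  const match = /^"version-(\d+)"$/.exec(header.trim())
  return match ? Number(match[1]) : null
}

// HTTP status to answer with before touching the row.
function checkPrecondition(ifMatch: string | undefined, currentVersion: number): 200 | 412 | 428 {
  const expected = parseVersionEtag(ifMatch)
  if (expected === null) return 428 // assumption: the header is mandatory on writes
  return expected === currentVersion ? 200 : 412
}

const fresh = checkPrecondition('"version-5"', 5)
const staleEtag = checkPrecondition('"version-5"', 6)
const missing = checkPrecondition(undefined, 5)
const nextEtag = formatVersionEtag(6)
```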

Data Modeling

Core Schema (PostgreSQL)

-- Projects (statuses, transitions, and issue types reference the project)
CREATE TABLE projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    key VARCHAR(10) UNIQUE NOT NULL,      -- e.g., "PROJ"
    name VARCHAR(255) NOT NULL,
    description TEXT,
    owner_id UUID NOT NULL REFERENCES users(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Statuses are project-scoped
CREATE TABLE statuses (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    name VARCHAR(100) NOT NULL,
    category VARCHAR(20) NOT NULL,        -- 'todo', 'in_progress', 'done'
    color VARCHAR(7) DEFAULT '#808080',
    position INT NOT NULL,                -- Column order
    is_initial BOOLEAN DEFAULT FALSE,     -- Default for new issues
    UNIQUE (project_id, name)
);

CREATE INDEX idx_statuses_project ON statuses(project_id, position);

-- Workflow transitions define allowed status changes
CREATE TABLE workflow_transitions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    from_status_id UUID REFERENCES statuses(id) ON DELETE CASCADE,  -- NULL = any
    to_status_id UUID NOT NULL REFERENCES statuses(id) ON DELETE CASCADE,
    name VARCHAR(100) NOT NULL,
    opsbar_sequence INT DEFAULT 10,       -- UI ordering
    UNIQUE (project_id, from_status_id, to_status_id)
);

-- Issue types (Epic, Story, Task, Bug)
CREATE TABLE issue_types (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    name VARCHAR(50) NOT NULL,
    icon VARCHAR(50),
    color VARCHAR(7),
    UNIQUE (project_id, name)
);

-- Issues with LexoRank ordering
CREATE TABLE issues (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id),
    issue_type_id UUID NOT NULL REFERENCES issue_types(id),
    status_id UUID NOT NULL REFERENCES statuses(id),

    -- Issue key: computed from project key + sequence
    issue_number INT NOT NULL,

    title VARCHAR(500) NOT NULL,
    description TEXT,

    assignee_id UUID REFERENCES users(id),
    reporter_id UUID NOT NULL REFERENCES users(id),

    priority VARCHAR(20) DEFAULT 'medium',

    -- LexoRank for ordering within status
    -- Format: "0|hzzzzz" (bucket | alphanumeric)
    rank VARCHAR(255) NOT NULL,

    -- Optimistic locking (NOT NULL so "version = $n" comparisons never see NULL)
    version INT NOT NULL DEFAULT 1,

    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),

    UNIQUE (project_id, issue_number)
);

-- Primary query: issues by status, ordered by rank
CREATE INDEX idx_issues_board ON issues(project_id, status_id, rank);

-- Secondary: issues by assignee
CREATE INDEX idx_issues_assignee ON issues(assignee_id, updated_at DESC);

-- Issue key lookup is served by the index behind UNIQUE (project_id, issue_number);
-- a separate index on the same columns would be redundant.

-- Comments
CREATE TABLE comments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    issue_id UUID NOT NULL REFERENCES issues(id) ON DELETE CASCADE,
    author_id UUID NOT NULL REFERENCES users(id),
    body TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_comments_issue ON comments(issue_id, created_at);

-- Activity log (append-only)
CREATE TABLE activity_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    issue_id UUID NOT NULL REFERENCES issues(id) ON DELETE CASCADE,
    user_id UUID NOT NULL REFERENCES users(id),
    action_type VARCHAR(50) NOT NULL,     -- 'status_change', 'assignment', etc.
    old_value JSONB,
    new_value JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_activity_issue ON activity_log(issue_id, created_at DESC);

Database Selection Rationale

Data Type	Store	Rationale
Issues, Projects	PostgreSQL	ACID, complex queries, JOIN capability
Board cache	Redis	Sub-ms reads, TTL for staleness
Search index	Elasticsearch	Full-text search, faceted filtering
Activity log	PostgreSQL → Kafka	Append-only, stream processing
Attachments	S3	Cost-effective blob storage

Denormalized Board Cache (Redis)

Why cache: Board queries join issues, statuses, and users. Caching avoids expensive JOINs on every load.

Structure:

# Board metadata
HSET board:{project_id}:meta
    columns_json "[{\"status_id\":\"s1\",\"name\":\"To Do\"}...]"
    total_issues 156
    last_updated 1706886400000

# Per-column issue list (sorted set by rank)
ZADD board:{project_id}:column:{status_id} {rank_score} {issue_id}

# Issue card data (hash - denormalized for fast read)
HSET issue:{issue_id}:card
    key "PROJ-123"
    title "Implement login"
    status_id "status-2"
    assignee_id "user-456"
    assignee_name "Alice"
    priority "high"
    rank "0|i000ab"
    version 5

Cache invalidation strategy:

Write-through: Update cache immediately after DB write
TTL: 5 minutes as safety net
Pub/Sub: Broadcast invalidation to all service instances
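The three mechanisms combine as follows: a write refreshes the entry, a broadcast removes it on other instances, and the TTL bounds how long a missed invalidation can survive. The class below is an in-memory stand-in for the Redis layer, with hypothetical names and an injectable clock so expiry can be demonstrated without waiting.

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Write-through: called right after the database write commits.
  put(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }

  // Pub/sub handler: another instance announced that this key changed.
  invalidate(key: string): void {
    this.entries.delete(key)
  }

  // TTL safety net: an entry past its expiry is treated as a miss.
  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }
}

let clock = 0
const FIVE_MINUTES_MS = 5 * 60 * 1000
const cache = new TtlCache<string>(FIVE_MINUTES_MS, () => clock)

cache.put("issue:issue-123:card", "card at version 5")
const hit = cache.get("issue:issue-123:card")

cache.put("issue:issue-456:card", "card at version 1")
cache.invalidate("issue:issue-456:card")
const afterInvalidate = cache.get("issue:issue-456:card")

clock = FIVE_MINUTES_MS // five minutes later, with no refresh in between
const afterTtl = cache.get("issue:issue-123:card")
```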

Low-Level Design: LexoRank Ordering

Why LexoRank?

Traditional integer-based ordering has a fundamental problem:

Before: [A:1, B:2, C:3, D:4]
Insert X between B and C:
After:  [A:1, B:2, X:3, C:4, D:5]  ← Must update C, D

With N items and frequent reorders, this is O(N) updates per operation.

Fractional indexing solution: Use lexicographically sortable strings where you can always find a value between any two existing values, so an insert only writes the moved row’s rank — siblings are untouched. Figma uses the same idea, with arbitrary-precision base-95 fractions stored as strings, for ordering children inside a frame (Figma — Realtime Editing of Ordered Sequences).

Before: [A:"aaa", B:"bbb", C:"ccc"]
Insert X between B and C:
After:  [A:"aaa", B:"bbb", X:"bbc", C:"ccc"]  ← Only X updated

LexoRank Format

Jira’s LexoRank uses the format bucket|value, where the bucket is a single digit and the value is a base-36 alphanumeric string (Atlassian KB: LexoRank):

0|hzzzzz
│ └─ Alphanumeric value (base-36, "0"–"9" + "a"–"z")
└── Bucket (0, 1, or 2)

Note

Production Jira ranks also carry a sub-rank after a : separator (for example 0|hzzzzz:), used to disambiguate concurrent inserts. The illustrations below collapse that detail; treat the value segment as the LexoRank “core” you would actually compute against.

Bucket rotation: The three buckets exist to support background rebalancing without taking writes offline. The balancer copies issues from the current bucket to the next one in the round-robin (0 → 1 → 2 → 0); new inserts can keep ranking against the source bucket while in-flight rows fan out to the destination (LexoRankBalanceOperation API).
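The format and the rotation order are simple enough to pin down in code. The parser below is a sketch based only on the description above (single-digit bucket, base-36 value, optional sub-rank after a colon); `parseRank` and `nextBucket` are illustrative names, not Jira APIs.

```typescript
interface ParsedRank {
  bucket: number // 0, 1, or 2
  value: string // base-36 core used for ordering
  subRank: string // text after ":", empty if absent
}

function parseRank(rank: string): ParsedRank {
  const match = /^([0-2])\|([0-9a-z]+)(?::([0-9a-z]*))?$/.exec(rank)
  if (!match) throw new Error(`Not a LexoRank: ${rank}`)
  return {
    bucket: Number(match[1]),
    value: match[2] ?? "",
    subRank: match[3] ?? "",
  }
}

// Round-robin destination for a rebalance: 0 -> 1 -> 2 -> 0.
function nextBucket(bucket: number): number {
  return (bucket + 1) % 3
}

const plain = parseRank("0|i002bc")
const withSubRank = parseRank("0|hzzzzz:")
const destination = nextBucket(2)
```

Because the bucket digit is the first character, every rank in the destination bucket sorts together as a plain string, which is what lets a rebalance proceed while the source bucket still accepts writes.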

Rank Calculation Algorithm

// Simplified LexoRank implementation
const LEXORANK_CHARS = "0123456789abcdefghijklmnopqrstuvwxyz"
const BASE = LEXORANK_CHARS.length // 36

interface LexoRank {
  bucket: number
  value: string
}

function parseLexoRank(rank: string): LexoRank {
  const [bucket, value] = rank.split("|")
  return { bucket: parseInt(bucket, 10), value }
}

function formatLexoRank(rank: LexoRank): string {
  return `${rank.bucket}|${rank.value}`
}

// Midpoint of two values read as base-36 fractions ("i" = 0.i, "0i" = 0.0i).
// Requires a < b lexicographically.
function getMidpoint(a: string, b: string): string {
  // Pad to equal length plus one spare digit, so adjacent values
  // (e.g. "a" and "b") still have a representable midpoint ("ai")
  const len = Math.max(a.length, b.length) + 1
  const aDigits = [...a.padEnd(len, "0")].map((c) => LEXORANK_CHARS.indexOf(c))
  const bDigits = [...b.padEnd(len, "0")].map((c) => LEXORANK_CHARS.indexOf(c))

  // 1. Add right to left, carrying into the more significant digit
  const sum: number[] = new Array(len).fill(0)
  let carry = 0
  for (let i = len - 1; i >= 0; i--) {
    const s = aDigits[i] + bDigits[i] + carry
    sum[i] = s % BASE
    carry = Math.floor(s / BASE)
  }

  // 2. Halve left to right, pushing the remainder into the less significant digit
  let result = ""
  let remainder = carry
  for (let i = 0; i < len; i++) {
    const current = remainder * BASE + sum[i]
    result += LEXORANK_CHARS[Math.floor(current / 2)]
    remainder = current % 2
  }

  return result.replace(/0+$/, "") // Trim trailing zeros
}

// `before` is the rank of the item that will precede the new position,
// `after` the rank of the item that will follow it.
function calculateNewRank(before: string | null, after: string | null, bucket: number = 0): string {
  if (!before && !after) {
    // First item - use middle of range
    return formatLexoRank({ bucket, value: "i" })
  }

  if (!before) {
    // Insert at top - find value between the lower bound and 'after'
    const afterRank = parseLexoRank(after!)
    const newValue = getMidpoint("0", afterRank.value)
    return formatLexoRank({ bucket, value: newValue })
  }

  if (!after) {
    // Insert at bottom - find value between 'before' and an upper bound.
    // The bound is one digit longer than 'before' so it always sorts above it.
    const beforeRank = parseLexoRank(before)
    const upperBound = "z".repeat(beforeRank.value.length + 1)
    const newValue = getMidpoint(beforeRank.value, upperBound)
    return formatLexoRank({ bucket, value: newValue })
  }

  // Insert between two items
  const beforeRank = parseLexoRank(before)
  const afterRank = parseLexoRank(after)
  const newValue = getMidpoint(beforeRank.value, afterRank.value)
  return formatLexoRank({ bucket, value: newValue })
}

Rebalancing Strategy

LexoRank strings grow whenever you keep inserting between two adjacent values:

1Initial:  "i"2After 1:  "ii"3After 2:  "iii"4...5After 50: "iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii"

Jira’s rebalancing thresholds (8.9.0+, per the Atlassian KB):

Max rank length	Action
128–159 characters	Rebalance is scheduled to run within 12 hours.
160–253 characters	Rebalance starts immediately.
≥ 254 characters	Rebalance starts immediately; ranking still works, but any operation whose target rank would itself exceed 254 characters fails until normalisation completes.

Caution

The pre-8.9 behaviour was different: the immediate trigger fired at 200 characters and ranking was disabled past that. Older “blog wisdom” floating around the internet still cites those numbers — verify against the Atlassian KB before turning them into runbook thresholds.
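If you adopt the 8.9.0+ thresholds from the table above for your own monitor, the mapping from longest rank to action is a three-way comparison. The function and the action labels below are illustrative; only the numeric boundaries come from the table.

```typescript
type RebalanceAction = "none" | "scheduled" | "immediate" | "immediate-degraded"

// Classify by the length of the longest rank currently stored.
function rebalanceAction(maxRankLength: number): RebalanceAction {
  if (maxRankLength >= 254) return "immediate-degraded" // some rank operations may fail
  if (maxRankLength >= 160) return "immediate"
  if (maxRankLength >= 128) return "scheduled" // within 12 hours
  return "none"
}

// Longest rank in a set, as a monitor might compute it per project.
function maxRankLength(ranks: string[]): number {
  return ranks.reduce((max, r) => Math.max(max, r.length), 0)
}

const healthy = rebalanceAction(maxRankLength(["0|i", "0|hzzzzz"]))
const scheduled = rebalanceAction(128)
const immediate = rebalanceAction(160)
const degraded = rebalanceAction(254)
```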

Rebalancing algorithm:

async function rebalanceColumn(projectId: string, statusId: string): Promise<void> {
  // 1. Take a per-column lock. NX makes the SET succeed only if the key is absent,
  //    so two rebalances never overlap. Ordinary rank writes must also check this
  //    key (or rank into the destination bucket) for the lock to protect them.
  const lockKey = `rebalance:${projectId}:${statusId}`
  const acquired = await redis.set(lockKey, "1", "EX", 300, "NX") // 5 min lock
  if (acquired !== "OK") return // another rebalance is already running

  try {
    // 2. Fetch all issues ordered by current rank
    const { rows: issues } = await db.query(
      `
      SELECT id, rank
      FROM issues
      WHERE project_id = $1 AND status_id = $2
      ORDER BY rank
    `,
      [projectId, statusId],
    )
    if (issues.length === 0) return

    // 3. Assign evenly-spaced new ranks across the 6-character value space
    const newBucket = (parseInt(issues[0].rank.split("|")[0], 10) + 1) % 3
    const RANK_WIDTH = 6
    const step = Math.floor(BASE ** RANK_WIDTH / (issues.length + 1))

    const updates = issues.map((issue, index) => {
      const position = step * (index + 1)
      const newValue = position.toString(36).padStart(RANK_WIDTH, "0")
      return {
        id: issue.id,
        newRank: `${newBucket}|${newValue}`,
      }
    })

    // 4. Batch update
    await db.transaction(async (tx) => {
      for (const { id, newRank } of updates) {
        await tx.query("UPDATE issues SET rank = $1 WHERE id = $2", [newRank, id])
      }
    })

    // 5. Invalidate cache
    await invalidateBoardCache(projectId)
  } finally {
    await redis.del(lockKey)
  }
}

Low-Level Design: Concurrent Edit Handling

Optimistic Locking Flow

Optimistic locking with version field: two clients load version 5; the first write succeeds, the second collides on a version-conditional UPDATE and must refetch.

Implementation

interface UpdateIssueInput {
  issueId: string
  title?: string
  description?: string
  assigneeId?: string
  version: number
}

interface UpdateResult {
  success: boolean
  issue?: Issue
  error?: { code: string; message: string }
}

async function updateIssue(input: UpdateIssueInput): Promise<UpdateResult> {
  const { issueId, version, ...updates } = input

  // Keep only the fields the caller actually set. Field names come from the
  // typed input above, never from free-form user strings.
  const entries = Object.entries(updates).filter(([, v]) => v !== undefined)

  if (entries.length === 0) {
    // An empty SET clause would be invalid SQL
    return { success: false, error: { code: "NO_CHANGES", message: "No fields to update" } }
  }

  // Build dynamic UPDATE query ($1 = id, $2 = version, fields start at $3)
  const setClause = entries.map(([k], i) => `${toSnakeCase(k)} = $${i + 3}`).join(", ")
  const values = entries.map(([, v]) => v)

  const result = await db.query(
    `
    UPDATE issues
    SET ${setClause}, version = version + 1, updated_at = NOW()
    WHERE id = $1 AND version = $2
    RETURNING *
  `,
    [issueId, version, ...values],
  )

  if (result.rowCount === 0) {
    // Check if issue exists
    const exists = await db.query("SELECT version FROM issues WHERE id = $1", [issueId])

    if (exists.rowCount === 0) {
      return {
        success: false,
        error: { code: "NOT_FOUND", message: "Issue not found" },
      }
    }

    const currentVersion = exists.rows[0].version
    return {
      success: false,
      error: {
        code: "VERSION_CONFLICT",
        message: `Version mismatch. Expected ${version}, current is ${currentVersion}`,
      },
    }
  }

  // Broadcast change
  await publishBoardEvent(result.rows[0].project_id, {
    action: "UPDATED",
    issue: result.rows[0],
  })

  return { success: true, issue: result.rows[0] }
}
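What the losing client does after a `VERSION_CONFLICT` is a separate decision. One option is to refetch and reapply the change to the fresh state, as sketched below with an in-memory row. `VersionedRow` and `updateWithRetry` are hypothetical names; `compareAndSet` plays the role of the version-conditional `UPDATE`.

```typescript
interface Versioned<T> {
  value: T
  version: number
}

class VersionedRow<T> {
  private row: Versioned<T>

  constructor(initial: T) {
    this.row = { value: initial, version: 1 }
  }

  read(): Versioned<T> {
    return { value: this.row.value, version: this.row.version }
  }

  // Mirrors UPDATE ... WHERE version = $expected: fails if someone wrote in between.
  compareAndSet(expectedVersion: number, next: T): boolean {
    if (this.row.version !== expectedVersion) return false
    this.row = { value: next, version: expectedVersion + 1 }
    return true
  }
}

// Refetch-and-reapply loop: `change` is re-run against the latest value on each attempt.
function updateWithRetry<T>(row: VersionedRow<T>, change: (current: T) => T, maxAttempts = 3): boolean {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { value, version } = row.read()
    if (row.compareAndSet(version, change(value))) return true
  }
  return false
}

interface IssueFields {
  title: string
  assignee: string
}

const row = new VersionedRow<IssueFields>({ title: "Old", assignee: "alice" })

// Simulate a competing writer that lands between our read and our write, once.
let interfered = false
const succeeded = updateWithRetry(row, (current) => {
  if (!interfered) {
    interfered = true
    row.compareAndSet(row.read().version, { ...current, assignee: "bob" })
  }
  return { ...current, title: "New" }
})

const finalState = row.read()
```

The second attempt starts from the competitor's write, so both changes survive. For edits where silently reapplying is not acceptable, surfacing the conflict to the user is the safer choice.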

Conflict Resolution Strategies

Strategy	Use Case	Trade-off
Last-Write-Wins	Most fields (title, assignee, priority)	May lose edits, but simple
Field-Level Merge	Non-conflicting field updates	More complex, preserves more
Manual Resolution	Description (rich text)	Best fidelity, worst UX
CRDT	Concurrent rich text editing	Complex, best for collaboration

Field-level merge example:

// Client 1 updates title (version 5 → 6)
// Client 2 updates assignee (version 5 → conflict)
// Instead of rejecting, merge if fields don't overlap

async function mergeUpdate(input: UpdateIssueInput, currentIssue: Issue): Promise<UpdateResult> {
  const { version, ...updates } = input

  // Find which fields changed since client's version
  const changedFields = await getChangedFieldsSince(input.issueId, version, currentIssue.version)

  // Check for conflicts
  const conflictingFields = Object.keys(updates).filter((f) => changedFields.includes(f))

  if (conflictingFields.length > 0) {
    return {
      success: false,
      error: {
        code: "FIELD_CONFLICT",
        message: `Conflicting fields: ${conflictingFields.join(", ")}`,
      },
    }
  }

  // No conflicts - apply update to latest version
  return updateIssue({
    ...input,
    version: currentIssue.version,
  })
}
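The merge above depends on knowing which fields changed between two versions. One way to answer that is a per-field change history keyed by the version each write produced. The sketch below is synchronous and in-memory, and the `FieldChange` shape is an assumption: the `activity_log` table in this article does not record versions, so a real implementation would need to add that or derive it another way.

```typescript
interface FieldChange {
  issueId: string
  version: number // the issue version this write produced
  field: string
}

// Fields written in versions (clientVersion, currentVersion].
function getChangedFieldsSince(
  history: FieldChange[],
  issueId: string,
  clientVersion: number,
  currentVersion: number,
): string[] {
  const fields = new Set<string>()
  for (const change of history) {
    if (
      change.issueId === issueId &&
      change.version > clientVersion &&
      change.version <= currentVersion
    ) {
      fields.add(change.field)
    }
  }
  return Array.from(fields)
}

const history: FieldChange[] = [
  { issueId: "issue-1", version: 5, field: "priority" },
  { issueId: "issue-1", version: 6, field: "title" },
  { issueId: "issue-2", version: 6, field: "assigneeId" },
]

// A client that loaded version 5 and the server now at version 6.
const changed = getChangedFieldsSince(history, "issue-1", 5, 6)
const assigneeConflicts = ["assigneeId"].filter((f) => changed.includes(f))
const titleConflicts = ["title"].filter((f) => changed.includes(f))
```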

Move Operation (Status + Rank)

Moving an issue changes two fields, status and rank, and both must be written atomically in a single transaction.

interface MoveIssueInput {
  issueId: string
  toStatusId: string
  rankAfterId?: string
  rankBeforeId?: string
  version: number
}

async function moveIssue(input: MoveIssueInput): Promise<UpdateResult> {
  const { issueId, toStatusId, rankAfterId, rankBeforeId, version } = input
  let previousStatusId: string | undefined

  const outcome: UpdateResult = await db.transaction(async (tx) => {
    // 1. Lock and fetch current issue
    const issue = await tx.query("SELECT * FROM issues WHERE id = $1 FOR UPDATE", [issueId])

    if (!issue.rows[0]) {
      return { success: false, error: { code: "NOT_FOUND", message: "Issue not found" } }
    }

    if (issue.rows[0].version !== version) {
      return {
        success: false,
        error: { code: "VERSION_CONFLICT", message: "Concurrent modification" },
      }
    }

    const currentIssue = issue.rows[0]
    previousStatusId = currentIssue.status_id

    // 2. Validate transition (a reorder within the same column is not a transition)
    const statusChanged = currentIssue.status_id !== toStatusId
    const transitionValid =
      !statusChanged ||
      (await validateTransition(tx, currentIssue.project_id, currentIssue.status_id, toStatusId))

    if (!transitionValid) {
      return {
        success: false,
        error: { code: "INVALID_TRANSITION", message: "Workflow does not allow this transition" },
      }
    }

    // 3. Calculate new rank
    let newRank: string

    if (rankAfterId) {
      const afterIssue = await tx.query("SELECT rank FROM issues WHERE id = $1", [rankAfterId])
      const beforeIssue = rankBeforeId
        ? await tx.query("SELECT rank FROM issues WHERE id = $1", [rankBeforeId])
        : null

      newRank = calculateNewRank(afterIssue.rows[0]?.rank, beforeIssue?.rows[0]?.rank)
    } else if (rankBeforeId) {
      const beforeIssue = await tx.query("SELECT rank FROM issues WHERE id = $1", [rankBeforeId])
      newRank = calculateNewRank(null, beforeIssue.rows[0]?.rank)
    } else {
      // Default: bottom of column
      const lastInColumn = await tx.query(
        `
        SELECT rank FROM issues
        WHERE project_id = $1 AND status_id = $2
        ORDER BY rank DESC LIMIT 1
      `,
        [currentIssue.project_id, toStatusId],
      )

      newRank = calculateNewRank(lastInColumn.rows[0]?.rank, null)
    }

    // 4. Update issue
    const result = await tx.query(
      `
      UPDATE issues
      SET status_id = $1, rank = $2, version = version + 1, updated_at = NOW()
      WHERE id = $3
      RETURNING *
    `,
      [toStatusId, newRank, issueId],
    )

    // 5. Log activity (only when the status actually changed)
    if (statusChanged) {
      await tx.query(
        `
        INSERT INTO activity_log (issue_id, user_id, action_type, old_value, new_value)
        VALUES ($1, $2, 'status_change', $3, $4)
      `,
        [
          issueId,
          getCurrentUserId(),
          JSON.stringify({ status_id: currentIssue.status_id }),
          JSON.stringify({ status_id: toStatusId }),
        ],
      )
    }

    return { success: true, issue: result.rows[0] }
  })

  // 6. Broadcast only after the transaction has committed, so subscribers
  //    never see a move that is later rolled back
  if (outcome.success && outcome.issue) {
    await publishBoardEvent(outcome.issue.project_id, {
      action: "MOVED",
      issue: outcome.issue,
      previousStatusId,
    })
  }

  return outcome
}

Low-Level Design: Workflow and Status Management

Workflow Data Model

Each project has its own workflow, defined by statuses and transitions.

Fetching Workflow Configuration

```graphql
query GetProjectWorkflow($projectId: ID!) {
  project(id: $projectId) {
    workflow {
      statuses {
        id
        name
        category
        color
        position
      }
      transitions {
        id
        name
        fromStatus {
          id
        }
        toStatus {
          id
        }
      }
    }
  }
}
```

Response structure:

```json
{
  "project": {
    "workflow": {
      "statuses": [
        { "id": "s1", "name": "To Do", "category": "TODO", "color": "#808080", "position": 1 },
        { "id": "s2", "name": "In Progress", "category": "IN_PROGRESS", "color": "#0052cc", "position": 2 },
        { "id": "s3", "name": "In Review", "category": "IN_PROGRESS", "color": "#8777d9", "position": 3 },
        { "id": "s4", "name": "Done", "category": "DONE", "color": "#36b37e", "position": 4 }
      ],
      "transitions": [
        { "id": "t1", "name": "Start Progress", "fromStatus": { "id": "s1" }, "toStatus": { "id": "s2" } },
        { "id": "t2", "name": "Submit for Review", "fromStatus": { "id": "s2" }, "toStatus": { "id": "s3" } },
        { "id": "t3", "name": "Approve", "fromStatus": { "id": "s3" }, "toStatus": { "id": "s4" } },
        { "id": "t4", "name": "Reject", "fromStatus": { "id": "s3" }, "toStatus": { "id": "s2" } },
        { "id": "t5", "name": "Reopen", "fromStatus": { "id": "s4" }, "toStatus": { "id": "s1" } }
      ]
    }
  }
}
```

Workflow Mutation API

```graphql
# Add a new status
mutation AddStatus($input: AddStatusInput!) {
  addStatus(input: $input) {
    status {
      id
      name
      category
      position
    }
  }
}

# Add a transition
mutation AddTransition($input: AddTransitionInput!) {
  addTransition(input: $input) {
    transition {
      id
      name
      fromStatus {
        id
      }
      toStatus {
        id
      }
    }
  }
}

# Reorder statuses (columns)
mutation ReorderStatuses($input: ReorderStatusesInput!) {
  reorderStatuses(input: $input) {
    statuses {
      id
      position
    }
  }
}
```

Client-Side Workflow Validation

To provide instant feedback, clients cache workflow rules:

```typescript
interface WorkflowCache {
  statuses: Map<string, Status>
  transitions: Map<string, Set<string>> // fromStatusId → Set<toStatusId>
}

class WorkflowValidator {
  private cache: WorkflowCache

  constructor(workflow: Workflow) {
    this.cache = {
      statuses: new Map(workflow.statuses.map((s) => [s.id, s])),
      transitions: new Map(),
    }

    // Build transition map
    for (const t of workflow.transitions) {
      const fromId = t.fromStatus?.id || "*" // null = any status
      if (!this.cache.transitions.has(fromId)) {
        this.cache.transitions.set(fromId, new Set())
      }
      this.cache.transitions.get(fromId)!.add(t.toStatus.id)
    }
  }

  canTransition(fromStatusId: string, toStatusId: string): boolean {
    // Check specific transition
    if (this.cache.transitions.get(fromStatusId)?.has(toStatusId)) {
      return true
    }
    // Check wildcard (from any status)
    if (this.cache.transitions.get("*")?.has(toStatusId)) {
      return true
    }
    return false
  }

  getAvailableTransitions(fromStatusId: string): Status[] {
    // Typed empty sets keep the merged set a Set<string>
    const specific = this.cache.transitions.get(fromStatusId) ?? new Set<string>()
    const wildcard = this.cache.transitions.get("*") ?? new Set<string>()
    const available = new Set<string>([...specific, ...wildcard])

    // Drop target IDs that no longer resolve to a cached status
    return Array.from(available)
      .map((id) => this.cache.statuses.get(id))
      .filter((status): status is Status => status !== undefined)
  }
}
```

Low-Level Design: Sync Engine and Offline Reconciliation

Path C above describes the GraphQL story; this section captures what changes when the same product needs Linear-grade local-first behaviour and offline edits. The mechanism is independent of the wire protocol — it works equally well over GraphQL subscriptions or a raw WebSocket transaction stream.

Data plane

Every workspace has a single monotonically increasing lastSyncId. The server bumps it on each persisted mutation and stamps the resulting delta packet with the new value before fanning it out to subscribers. Clients persist lastSyncId alongside the local model store so they can resume where they left off (Scaling the Linear Sync Engine, reverse-linear-sync-engine).

Sync engine flow: local mutations enqueue transactions, the server assigns a sync ID, and delta packets are fanned out and applied to the in-memory pool.

Three bootstrap modes hydrate the local store on app start (reverse-linear-sync-engine):

Bootstrap	When	Payload
Full	First load on a device	Full set of hot models (Issue, Project, User, Cycle)
Partial	Returning user with cached state but missed range	`lastSyncId` cursor + deferred models (Comment, History)
Local	Subsequent in-session loads	Hydrate from IndexedDB only; no network until first write

Mutation lifecycle

A local edit follows a fixed lifecycle:

1. UI calls a mutator on the in-memory model. The change is applied to the MobX object pool optimistically, so the UI re-renders with no network in the hot path.
2. A Transaction record {op, entity, fields, baseSyncId, clientId} is appended to the local queue and persisted to IndexedDB.
3. The sync client streams pending transactions over the WebSocket. When the server acks {txId, lastSyncId=N}, the client drops the transaction from the queue.
4. The server fans the resulting delta packet out to every other connected client; remote clients apply the SyncAction set to their pool and bump their lastSyncId.

If the device is offline, steps 3–4 are deferred. Transactions stay in IndexedDB until the WebSocket reconnects.
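The queue at the heart of this lifecycle can be sketched as below. This is a minimal in-memory version under stated assumptions: the `txId` and `entityId` fields, the `TransactionQueue` class, and the `persist` callback (standing in for the IndexedDB write) are hypothetical names, not Linear's actual client API.

```typescript
interface Transaction {
  txId: string // hypothetical: lets the client match a server ack to a queued entry
  op: "create" | "update" | "delete"
  entity: string
  entityId: string // hypothetical
  fields: Record<string, unknown>
  baseSyncId: number
  clientId: string
}

type Persist = (pending: Transaction[]) => void

class TransactionQueue {
  private pending: Transaction[] = []
  private persist: Persist

  // `persist` stands in for the IndexedDB write; it defaults to a no-op.
  constructor(persist: Persist = () => {}) {
    this.persist = persist
  }

  // Step 2: append the transaction and persist the queue before any network I/O.
  enqueue(tx: Transaction): void {
    this.pending.push(tx)
    this.persist(this.pending.slice())
  }

  // Step 3: the server acked the transaction, so it can be dropped.
  ack(txId: string): void {
    this.pending = this.pending.filter((tx) => tx.txId !== txId)
    this.persist(this.pending.slice())
  }

  // Everything still waiting for an ack, in submission order.
  unacked(): Transaction[] {
    return this.pending.slice()
  }
}
```

While the device is offline, `enqueue` keeps working and `unacked()` simply grows; nothing is dropped until an ack arrives.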

Reconnect and rebase

On reconnect, the client cannot just replay the queue against the live server — the workspace may have advanced. The Linear-style protocol is:

1. open WebSocket, send {lastSyncId: baseSyncId}
2. server streams missed delta packets up to current lastSyncId
3. client applies them to the pool; server state is now caught up
4. for each queued Tx: rebase fields against the new base, re-apply
5. flush queued Tx; server acks normally

Rebasing is field-level last-write-wins: if the server already moved assignee while the client was offline, the client’s pending assignee write replaces it on reconnect. For free-form text (issue description, comment body) the rebase step instead hands off to a CRDT (Yjs / Automerge), which merges concurrent inserts without losing characters¹.
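A minimal sketch of the field-level rebase described above, assuming each pending transaction is a flat bag of fields. The `rebaseFields` function and the `crdtFields` set are illustrative names; the CRDT merge itself is out of scope here and is only signalled through the returned list.

```typescript
type Fields = Record<string, unknown>

interface RebaseResult {
  merged: Fields
  deferredToCrdt: string[]
}

// Re-applies a pending local write on top of the caught-up server state.
// Scalar fields: the pending write wins. Fields listed in `crdtFields`
// are left untouched and reported so a CRDT merge can handle them.
function rebaseFields(serverFields: Fields, pendingFields: Fields, crdtFields: ReadonlySet<string>): RebaseResult {
  const merged: Fields = { ...serverFields }
  const deferredToCrdt: string[] = []

  for (const name of Object.keys(pendingFields)) {
    if (crdtFields.has(name)) {
      deferredToCrdt.push(name)
      continue
    }
    merged[name] = pendingFields[name]
  }

  return { merged, deferredToCrdt }
}
```

For example, if the server moved `assignee` to Bob and raised `priority` while the client was offline with a pending `assignee = Carol` write, the rebased issue has Carol as assignee and keeps the server's priority.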

Idempotency and exactly-once delivery

The transaction queue is the only retry source, so every mutation needs an idempotency key:

Each Transaction carries {clientId, clientTxSeq}. The server stores the last applied clientTxSeq per clientId; a re-delivered transaction is not applied again and is acknowledged with the original lastSyncId.
Each delta packet carries its lastSyncId. Clients drop packets whose lastSyncId <= localLastSyncId — natural deduplication on reconnect storms.
HTTP fallbacks (file uploads, third-party integrations) use an Idempotency-Key header per the Stripe pattern².
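The first two rules can be sketched as follows. This is an in-memory illustration under stated assumptions: `TransactionLog` and `shouldApplyPacket` are hypothetical names, and the sketch remembers every applied sequence number per client rather than only the latest one, which is simpler to reason about but uses more memory than the scheme described above.

```typescript
interface ApplyResult {
  lastSyncId: number
  duplicate: boolean
}

class TransactionLog {
  private lastSyncId = 0
  // clientId → (clientTxSeq → lastSyncId assigned when it was first applied)
  private applied = new Map<string, Map<number, number>>()

  apply(clientId: string, clientTxSeq: number): ApplyResult {
    let seen = this.applied.get(clientId)
    if (!seen) {
      seen = new Map<number, number>()
      this.applied.set(clientId, seen)
    }

    // Re-delivery: do not apply again, answer with the original sync ID.
    const original = seen.get(clientTxSeq)
    if (original !== undefined) {
      return { lastSyncId: original, duplicate: true }
    }

    // First delivery: bump the workspace-wide counter and remember it.
    this.lastSyncId += 1
    seen.set(clientTxSeq, this.lastSyncId)
    return { lastSyncId: this.lastSyncId, duplicate: false }
  }
}

// Client side: drop delta packets that are not newer than local state.
function shouldApplyPacket(packetSyncId: number, localLastSyncId: number): boolean {
  return packetSyncId > localLastSyncId
}
```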

Why not vector / hybrid logical clocks?

Vector clocks correctly capture concurrency but cost O(N) per write where N is the number of replicas — not viable for a workspace with 100k clients. Hybrid Logical Clocks (HLC) bound that cost but still require multiple participants to agree on causality at write time³. A single server-assigned lastSyncId is the cheapest correct choice for issue trackers, where conflicts between two humans editing the same field within milliseconds are rare in practice.

Important

The sync engine is the single point that writes are serialised through. Sharding it (per workspace, never per entity) is fine; splitting it across entities inside a workspace breaks the global ordering guarantee that makes LWW safe.

Low-Level Design: Permissions and Issue-Level Security

Issue trackers consistently land on a hybrid model: RBAC for project-scoped operations, ABAC-style overlays for per-issue visibility. Jira’s three layers are the canonical reference⁴:

Layer	Granted to	Examples
Global	Users / groups	`SYS_ADMIN`, `BROWSE_USERS`
Project	Project roles via permission scheme	`BROWSE_PROJECTS`, `CREATE_ISSUES`, `EDIT_ISSUES`
Issue-level	Roles / groups / users via security	Restrict an HR-tagged issue to the HR security level

Role assignments are project-scoped: a user may be a Developer in one project and an Observer in another. This avoids the role explosion that pure global RBAC produces and matches how teams reason about access (“who is in this project, and what can each role do here?”)⁵.

Resolution flow

Permission resolution: deny by default, then short-circuit through global, project (RBAC), and issue-level security checks.

Resolution is short-circuit: if a global permission grants the action, no project / issue check runs. Otherwise the project’s permission scheme is consulted via the user’s project roles, and finally the issue’s security level (if any) gates visibility.

Two often-missed properties:

Inheritance. Sub-tasks inherit the parent’s security level and cannot override it⁴. This is what stops a contractor from being able to see a sub-task whose parent is hidden.
No field-level permissions in Jira. Once an issue is visible, every field on it is visible. Field-level redaction requires either a custom screen or an external authorisation layer⁴.
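The resolution order and the sub-task inheritance rule can be sketched as a pure function. The `AccessContext` shape, `IssueRef`, and the function names are assumptions for illustration; a real resolver would populate the three sets from the tables in the schema that follows.

```typescript
interface IssueRef {
  id: string
  securityLevelId: string | null
  parent?: IssueRef // set for sub-tasks
}

interface AccessContext {
  globalPermissions: ReadonlySet<string>
  projectPermissions: ReadonlySet<string> // already resolved via the user's project roles
  securityLevelIds: ReadonlySet<string> // levels the user is a member of
}

// Sub-tasks take the parent's security level and cannot override it.
function effectiveSecurityLevel(issue: IssueRef): string | null {
  return issue.parent ? effectiveSecurityLevel(issue.parent) : issue.securityLevelId
}

function canPerform(permission: string, issue: IssueRef, ctx: AccessContext): boolean {
  // 1. A global grant short-circuits everything else.
  if (ctx.globalPermissions.has(permission)) return true

  // 2. Deny by default unless the project scheme grants the permission.
  if (!ctx.projectPermissions.has(permission)) return false

  // 3. An issue-level security level, if present, gates visibility.
  const level = effectiveSecurityLevel(issue)
  return level === null || ctx.securityLevelIds.has(level)
}
```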

Schema

```sql
CREATE TABLE project_roles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    name VARCHAR(50) NOT NULL,
    UNIQUE (project_id, name)
);

CREATE TABLE project_role_members (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    role_id UUID NOT NULL REFERENCES project_roles(id) ON DELETE CASCADE,
    user_id UUID REFERENCES users(id),
    group_id UUID REFERENCES groups(id),
    -- Exactly one of user_id / group_id must be set
    CHECK ((user_id IS NULL) <> (group_id IS NULL))
);
-- Partial unique indexes guarantee no duplicate user / group per role
CREATE UNIQUE INDEX idx_role_members_user
    ON project_role_members (role_id, user_id) WHERE user_id IS NOT NULL;
CREATE UNIQUE INDEX idx_role_members_group
    ON project_role_members (role_id, group_id) WHERE group_id IS NOT NULL;

CREATE TABLE permission_scheme_grants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    permission VARCHAR(64) NOT NULL,        -- 'EDIT_ISSUES', 'TRANSITION_ISSUES', ...
    role_id UUID REFERENCES project_roles(id) ON DELETE CASCADE,
    group_id UUID REFERENCES groups(id),
    user_id UUID REFERENCES users(id)
);
CREATE INDEX idx_pscheme_lookup ON permission_scheme_grants(project_id, permission);

CREATE TABLE issue_security_levels (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    name VARCHAR(64) NOT NULL,
    UNIQUE (project_id, name)
);

CREATE TABLE issue_security_level_members (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    level_id UUID NOT NULL REFERENCES issue_security_levels(id) ON DELETE CASCADE,
    role_id UUID REFERENCES project_roles(id) ON DELETE CASCADE,
    group_id UUID REFERENCES groups(id),
    user_id UUID REFERENCES users(id),
    -- Exactly one of role_id / group_id / user_id must be set
    CHECK (
        (role_id IS NOT NULL)::int
      + (group_id IS NOT NULL)::int
      + (user_id IS NOT NULL)::int = 1
    )
);
CREATE INDEX idx_isl_members_level ON issue_security_level_members(level_id);

ALTER TABLE issues
    ADD COLUMN security_level_id UUID REFERENCES issue_security_levels(id);
```

Caching authorisation

Per-request resolution against four joins is too slow on hot paths (board load, search). Two safe caches:

Effective-permission cache. (user_id, project_id) → bitset of granted permissions, invalidated on role / scheme change. Lives in Redis with a 5-minute TTL plus pub/sub-driven busting.
Visible-issue filter. For search and listing, materialise per-user (user_id, project_id) → security_level_ids[] and inject the filter into the search query so the engine never returns rows the caller cannot read.

Never cache an allow decision longer than a deny decision: a stale allow means the user briefly sees too much, which is precisely what authorisation must prevent. A stale deny only delays access.
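A minimal sketch of the effective-permission cache, with an in-memory `Map` standing in for Redis. The permission list, the `PermissionCache` class, and `invalidateProject` (the hook a pub/sub message would call on a role or scheme change) are assumptions for illustration.

```typescript
// Hypothetical subset of permissions; the bit position is the array index.
const PERMISSIONS = ["BROWSE_PROJECTS", "CREATE_ISSUES", "EDIT_ISSUES", "TRANSITION_ISSUES"] as const
type Permission = (typeof PERMISSIONS)[number]

function encodePermissions(granted: readonly Permission[]): number {
  let bits = 0
  for (const permission of granted) bits |= 1 << PERMISSIONS.indexOf(permission)
  return bits
}

function hasPermission(bits: number, permission: Permission): boolean {
  return (bits & (1 << PERMISSIONS.indexOf(permission))) !== 0
}

interface CacheEntry {
  bits: number
  projectId: string
  expiresAt: number
}

class PermissionCache {
  private entries = new Map<string, CacheEntry>()
  private ttlMs: number

  constructor(ttlMs: number = 5 * 60 * 1000) {
    this.ttlMs = ttlMs
  }

  set(userId: string, projectId: string, bits: number, now: number = Date.now()): void {
    this.entries.set(`${userId}:${projectId}`, { bits, projectId, expiresAt: now + this.ttlMs })
  }

  // Returns undefined on a miss or an expired entry; the caller then resolves from Postgres.
  get(userId: string, projectId: string, now: number = Date.now()): number | undefined {
    const key = `${userId}:${projectId}`
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= now) {
      this.entries.delete(key)
      return undefined
    }
    return entry.bits
  }

  // Called when a role or permission scheme changes in the project.
  invalidateProject(projectId: string): void {
    this.entries.forEach((entry, key) => {
      if (entry.projectId === projectId) this.entries.delete(key)
    })
  }
}
```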

Low-Level Design: Search Subsystem

Issue search has a distinctive shape: many filters (assignee = me AND status in (...) AND label = ...), modest text payloads (titles + descriptions + comments), and a strong demand for typo tolerance and “as-you-type” feedback. The choice of engine matters more than for a typical full-text workload.

Engine selection

Engine	Architecture	Best for	Watch-outs
Postgres FTS	`tsvector` + GIN, in-database	Single-tenant or small multi-tenant; data sovereignty	`tsvector` < 1 MB; position values ≤ 16 383; no native typo tolerance⁶
Meilisearch	Single-node Rust	Fast as-you-type, small datasets	Memory-resident index; HA story is weak
Typesense	Distributed C++, Raft	Sweet spot for SaaS scale + simple API	Smaller community; fewer aggregation primitives
OpenSearch / ES	Distributed Java + Lucene	Multi-tenant SaaS, faceted analytics	Operational cost; index sizing and JVM tuning

A pragmatic split many teams adopt: ship Postgres FTS for the first 10⁴ issues per tenant, and promote to OpenSearch / Typesense once a tenant crosses an indexable-bytes threshold. Keep the query API engine-agnostic so the swap is a routing change.
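The routing decision itself can be a small pure function. In this sketch the 10⁴ issue threshold comes from the text above, while the byte threshold, the type names, and `pickEngine` are placeholder assumptions to tune per deployment.

```typescript
type SearchEngine = "postgres_fts" | "opensearch"

interface TenantStats {
  issueCount: number
  indexableBytes: number
}

interface RoutingThresholds {
  maxIssues: number
  maxIndexableBytes: number
}

// Assumed defaults: 10^4 issues (from the text) and an arbitrary 256 MiB of indexable text.
const DEFAULT_THRESHOLDS: RoutingThresholds = {
  maxIssues: 10000,
  maxIndexableBytes: 256 * 1024 * 1024,
}

// Tenants stay on Postgres FTS until they cross either threshold.
function pickEngine(stats: TenantStats, thresholds: RoutingThresholds = DEFAULT_THRESHOLDS): SearchEngine {
  const tooManyIssues = stats.issueCount > thresholds.maxIssues
  const tooMuchText = stats.indexableBytes > thresholds.maxIndexableBytes
  return tooManyIssues || tooMuchText ? "opensearch" : "postgres_fts"
}
```

Because the function only returns a label, promoting a tenant is a data change (its stats cross a threshold) rather than a code change.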

Ingestion pipeline

Issue documents are denormalised projections (issue + status + assignee + comments concatenated for body). They must be eventually consistent with Postgres but can lag the primary store by seconds.

Search ingestion pipeline: outbox + CDC stream issue changes through Kafka into OpenSearch, with index aliases for zero-downtime reindex.

Three patterns sit behind that diagram:

Outbox + CDC — every write to issues also inserts into an outbox table in the same transaction; Debezium (or Postgres logical replication) tails the outbox and publishes to Kafka. Avoids dual-writes drifting on partial failure⁷.
Indexer is idempotent — every doc carries the source version; the indexer drops any update whose version is older than what the index already holds.
Alias-swap reindex — full rebuilds write to issues_v(N+1) and atomically point the issues alias at it once caught up. No downtime, no half-indexed reads.
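The idempotent-indexer rule can be illustrated with an in-memory stand-in for the search index. `VersionedIndex` and the `IssueDoc` shape are hypothetical; with a real engine the same check is usually expressed through its external-versioning or conditional-update feature.

```typescript
interface IssueDoc {
  id: string
  version: number // source row version carried through the outbox event
  title: string
}

class VersionedIndex {
  private docs = new Map<string, IssueDoc>()

  // Returns true if the document was written, false if it was dropped as stale.
  // An equal version is treated as a re-delivery and dropped as well.
  upsert(doc: IssueDoc): boolean {
    const current = this.docs.get(doc.id)
    if (current && current.version >= doc.version) return false
    this.docs.set(doc.id, doc)
    return true
  }

  get(id: string): IssueDoc | undefined {
    return this.docs.get(id)
  }
}
```

With this check in place, Kafka redeliveries and out-of-order partitions cannot roll a document back to an older state.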

Query path

A typical board-search query combines text + filters + facets:

```json
{
  "size": 50,
  "query": {
    "bool": {
      "must": [{ "multi_match": { "query": "login bug", "fields": ["title^3", "body"] } }],
      "filter": [
        { "term": { "project_id": "p-1" } },
        { "terms": { "security_level_id": ["lvl-public", "lvl-eng"] } },
        { "term": { "status_category": "in_progress" } }
      ]
    }
  },
  "aggs": {
    "by_assignee": { "terms": { "field": "assignee_id", "size": 10 } }
  }
}
```

The security_level_id filter is injected by the API layer from the per-user visibility cache (above). Never trust a client-supplied security filter — clients only get to choose project, status, assignee, etc.
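One way to make that injection hard to get wrong is to build the query in a single function that accepts only the client-controllable parameters plus the server-resolved level IDs. The `ClientSearchParams` shape and `buildSearchQuery` are illustrative assumptions; the query body mirrors the example above.

```typescript
interface ClientSearchParams {
  projectId: string
  text: string
  statusCategory?: string
}

// `visibleLevelIds` must come from the server-side visibility cache, never from the request.
function buildSearchQuery(params: ClientSearchParams, visibleLevelIds: readonly string[]) {
  const filter: object[] = [
    { term: { project_id: params.projectId } },
    { terms: { security_level_id: visibleLevelIds.slice() } },
  ]
  if (params.statusCategory) {
    filter.push({ term: { status_category: params.statusCategory } })
  }

  return {
    size: 50,
    query: {
      bool: {
        must: [{ multi_match: { query: params.text, fields: ["title^3", "body"] } }],
        filter,
      },
    },
    aggs: {
      by_assignee: { terms: { field: "assignee_id", size: 10 } },
    },
  }
}
```

Only the named fields of `ClientSearchParams` are read, so any security filter a client smuggles into the request body is ignored rather than merged.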

Low-Level Design: Notifications

Notifications are the system’s most unbounded fan-out path: a single @team mention on a 200-person project can produce up to 800 deliveries (200 recipients × four channels). The design priorities are channel isolation, idempotency, and backpressure.

Notification fan-out: domain events flow through a router into per-channel queues, with a digest aggregator for email and a dead-letter queue for failures.

Pipeline

Domain event is published to a Kafka topic (issue.commented, issue.assigned, mention.created).
Router resolves subscribers by union of: assignee, reporter, watchers, mentioned users, project subscribers. It then filters by per-user channel preferences and current presence (no mobile push if the user is online on web — Slack’s well-documented heuristic⁸).
Per-channel queues (in-app, push, email, webhook) decouple delivery so a failing email provider does not block in-app delivery.
Delivery workers call APNs / FCM / SES / outbound webhooks with retry + DLQ. Each worker carries an idempotency key derived from (event_id, user_id, channel) so retries cannot double-deliver.
Digest aggregator holds email events in a per-user window (e.g. 5 minutes for mentions, 24 hours for low-priority changes) and emits one combined message; this is what stops a busy issue from spamming a watcher inbox.
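The router and the idempotency key can be sketched as three small functions. The `IssueEvent` and `Preferences` shapes and the function names are assumptions for illustration; `Preferences` follows the default `channels` JSON in the schema below, and webhook routing is left out for brevity.

```typescript
type Channel = "in_app" | "push" | "email" | "webhook"

interface Preferences {
  in_app: boolean
  push: boolean
  email: boolean | "digest"
}

interface IssueEvent {
  eventId: string
  assigneeId?: string
  reporterId?: string
  watcherIds: string[]
  mentionedIds: string[]
  projectSubscriberIds: string[]
}

// Union of every subscriber source; a Set removes users reached by several routes.
function resolveRecipients(event: IssueEvent): string[] {
  const ids = new Set<string>()
  if (event.assigneeId) ids.add(event.assigneeId)
  if (event.reporterId) ids.add(event.reporterId)
  const lists = [event.watcherIds, event.mentionedIds, event.projectSubscriberIds]
  for (const list of lists) {
    for (const id of list) ids.add(id)
  }
  return Array.from(ids)
}

// Applies per-user preferences and the presence rule (no push while online on web).
function channelsFor(prefs: Preferences, onlineOnWeb: boolean): Channel[] {
  const channels: Channel[] = []
  if (prefs.in_app) channels.push("in_app")
  if (prefs.push && !onlineOnWeb) channels.push("push")
  if (prefs.email) channels.push("email") // "digest" is routed too; the aggregator batches it
  return channels
}

// One key per (event, user, channel) so a retried delivery can be detected.
function idempotencyKey(eventId: string, userId: string, channel: Channel): string {
  return `${eventId}:${userId}:${channel}`
}
```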

Watcher / subscription model

```sql
CREATE TABLE notification_subscriptions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    target_type VARCHAR(20) NOT NULL,    -- 'issue', 'project', 'epic'
    target_id UUID NOT NULL,
    reason VARCHAR(20) NOT NULL,         -- 'assignee', 'mention', 'watch', 'subscribed_to_project'
    UNIQUE (user_id, target_type, target_id, reason)
);
CREATE INDEX idx_notif_sub_target ON notification_subscriptions(target_type, target_id);

CREATE TABLE notification_preferences (
    user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    channels JSONB NOT NULL DEFAULT '{"in_app":true,"push":true,"email":"digest"}'::jsonb
);
```

The reason column is what the UI surfaces (“You were assigned”, “You were mentioned”); it is also what allows a user to unsubscribe selectively rather than from the whole project.

Audit log vs notification log

These are two systems, not one:

Activity / audit log (activity_log table, append-only) is the system of record for what changed, who changed it, when. It feeds the issue history view and compliance exports. Never delete from it — soft-deletes only.
Notification log records what was delivered to whom, on which channel, with which result. It is what the inbox reads from and what powers idempotency. It can be aged out (90-day TTL is typical).

Low-Level Design: Attachments

Attachments are the only part of the system with multi-MB payloads on the hot path. The design rule is “blobs in object storage, references in Postgres”. Jira Cloud, GitHub, Linear, and Asana all converge on the same three primitives: presigned uploads, antivirus scanning, and per-tenant quotas⁹.

Attachment upload: client requests a presigned multipart URL, uploads parts directly to S3, the server completes and triggers an antivirus scanner that promotes clean objects to a clean bucket and quarantines infected ones.

Upload contract

Presigned multipart upload. API issues a presigned CreateMultipartUpload URL plus per-part PUT URLs scoped to a single object key in the incoming bucket. The server never proxies the bytes; this keeps API instances small and avoids egress cost spikes.
Quota gate before signing. Tenant size + per-issue size + per-file size caps are enforced at sign time. A signed URL is the authorisation; once issued, S3 will accept the upload, so the gate must fire here.
Mime / extension allow-list is also enforced at sign time. Block executable extensions by default; let admins opt in.
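The sign-time gate can be sketched as a pure check that runs before any URL is presigned. The limit names, the reason codes, and `checkUpload` are assumptions; this sketch uses an extension block-list for the "block executables by default" rule, and a MIME allow-list would slot in the same way.

```typescript
interface UploadRequest {
  fileName: string
  sizeBytes: number
}

interface CurrentUsage {
  tenantBytes: number
  issueBytes: number
}

interface UploadLimits {
  maxFileBytes: number
  maxIssueBytes: number
  maxTenantBytes: number
  blockedExtensions: ReadonlySet<string>
}

type UploadDecision = { ok: true } | { ok: false; reason: string }

// Runs before presigning: once a URL is issued, S3 will accept the bytes.
function checkUpload(req: UploadRequest, usage: CurrentUsage, limits: UploadLimits): UploadDecision {
  const dot = req.fileName.lastIndexOf(".")
  const extension = dot >= 0 ? req.fileName.slice(dot + 1).toLowerCase() : ""

  if (limits.blockedExtensions.has(extension)) return { ok: false, reason: "EXTENSION_BLOCKED" }
  if (req.sizeBytes > limits.maxFileBytes) return { ok: false, reason: "FILE_TOO_LARGE" }
  if (usage.issueBytes + req.sizeBytes > limits.maxIssueBytes) return { ok: false, reason: "ISSUE_QUOTA_EXCEEDED" }
  if (usage.tenantBytes + req.sizeBytes > limits.maxTenantBytes) return { ok: false, reason: "TENANT_QUOTA_EXCEEDED" }
  return { ok: true }
}
```

The declared size is only a claim at this point, so the completion step should still compare the stored object size against what was signed.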

Scan and promote

S3 ObjectCreated events fan out to an antivirus stage:

Option	Notes
Lambda + ClamAV layer	Cheap up to ~250 MB; cold-start friendly; Lambda’s `/tmp` is the bottleneck for huge files
ECS / EKS scanner pool	Required for multi-GB files (CI artefacts, screen recordings); scales horizontally
AWS GuardDuty Malware Protection for S3	Managed alternative; charged per GB scanned; useful when you don’t want to operate ClamAV¹⁰

Clean objects are copied to the clean bucket and an attachments row is committed with the object key and content hash; infected objects are moved to a quarantine bucket and the attachment is marked infected. Only clean attachments are exposed via the download URL.

Download and serving

Downloads are also presigned, scoped per-request to a short TTL (5 minutes), and gated by the same permission resolver as the parent issue.
Image / PDF previews are pre-generated by an async worker writing thumbnails to a sibling key (<key>/preview-256.webp); this keeps the issue card fast and avoids fetching multi-MB originals for the avatar grid.
Cache attachments behind a CDN with Cache-Control: private, max-age=... and use signed URLs as the cache key — public CDN caching of private content is the classic SaaS data-leak.

Frontend Considerations

Board State Management

Normalized data structure:

```typescript
interface BoardState {
  // Entities by ID
  issues: Record<string, Issue>
  statuses: Record<string, Status>
  users: Record<string, User>

  // Ordering
  columnOrder: string[] // Status IDs in display order
  issueOrder: Record<string, string[]> // statusId → issueIds in rank order

  // Pagination
  columnCursors: Record<string, string | null>
  columnHasMore: Record<string, boolean>

  // UI state
  draggingIssueId: string | null
  dropTargetColumn: string | null
  dropTargetIndex: number | null
}
```

Why normalized:

Moving an issue updates two arrays, not nested objects
React reference equality works for memoization
Easier to apply real-time updates

Optimistic Updates for Drag-and-Drop

```typescript
import { useRef, useState } from "react"

function useMoveIssue() {
  const [boardState, setBoardState] = useState<BoardState>(initialState)
  const pendingMoves = useRef<Map<string, { previousState: BoardState }>>(new Map())

  // toIndex is the position in the target column with the dragged card removed
  const moveIssue = async (issueId: string, toStatusId: string, toIndex: number) => {
    const issue = boardState.issues[issueId]
    const fromStatusId = issue.statusId

    // 1. Save previous state for rollback
    const previousState = structuredClone(boardState)
    pendingMoves.current.set(issueId, { previousState })

    // Target column without the dragged issue, so a same-column reorder
    // neither duplicates the card nor uses it as its own neighbour
    const withoutIssue = (ids: string[] = []) => ids.filter((id) => id !== issueId)
    const targetColumn = withoutIssue(boardState.issueOrder[toStatusId])

    // 2. Optimistic update
    setBoardState((state) => {
      const newColumnOrder = withoutIssue(state.issueOrder[toStatusId])
      newColumnOrder.splice(toIndex, 0, issueId)

      return {
        ...state,
        issueOrder: {
          ...state.issueOrder,
          // Remove from old column, then add to new column at index
          // (when both are the same column, the second key wins)
          [fromStatusId]: withoutIssue(state.issueOrder[fromStatusId]),
          [toStatusId]: newColumnOrder,
        },
        // Update issue status
        issues: {
          ...state.issues,
          [issueId]: { ...issue, statusId: toStatusId },
        },
      }
    })

    // 3. Server request
    const rankAfterId = toIndex > 0 ? targetColumn[toIndex - 1] : undefined
    const rankBeforeId = targetColumn[toIndex]

    try {
      const result = await api.moveIssue({
        issueId,
        toStatusId,
        rankAfterId,
        rankBeforeId,
        version: issue.version,
      })

      if (!result.success) {
        throw new Error(result.error?.message || "Move failed")
      }

      // 4. Update with server-assigned rank and version
      setBoardState((state) => ({
        ...state,
        issues: {
          ...state.issues,
          [issueId]: { ...state.issues[issueId], ...result.issue },
        },
      }))

      pendingMoves.current.delete(issueId)
    } catch (error) {
      // 5. Rollback on failure (restores the whole snapshot taken in step 1)
      const pending = pendingMoves.current.get(issueId)
      if (pending) {
        setBoardState(pending.previousState)
        pendingMoves.current.delete(issueId)
      }
      toast.error("Failed to move issue. Please try again.")
    }
  }

  // setBoardState and pendingMoves are exposed so the subscription hook can share them
  return { boardState, setBoardState, pendingMoves, moveIssue }
}
```

Real-time Update Handling

```typescript
import { useEffect, type Dispatch, type MutableRefObject, type SetStateAction } from "react"

// Board state and the pending-move map are owned by the drag-and-drop hook and
// passed in, so remote events and optimistic moves update the same state
function useBoardSubscription(
  projectId: string,
  setBoardState: Dispatch<SetStateAction<BoardState>>,
  pendingMoves: MutableRefObject<Map<string, unknown>>,
) {
  useEffect(() => {
    const subscription = graphqlClient
      .subscribe({
        query: BOARD_CHANGED_SUBSCRIPTION,
        variables: { projectId },
      })
      .subscribe({
        next: ({ data }) => {
          const event = data.boardChanged

          // Skip if this is our own optimistic update
          if (pendingMoves.current.has(event.issue.id)) {
            return
          }

          setBoardState((state) => {
            switch (event.action) {
              case "MOVED":
                return handleRemoteMove(state, event)
              case "UPDATED":
                return handleRemoteUpdate(state, event)
              case "CREATED":
                return handleRemoteCreate(state, event)
              case "DELETED":
                return handleRemoteDelete(state, event)
              default:
                return state
            }
          })
        },
      })

    return () => subscription.unsubscribe()
  }, [projectId, setBoardState, pendingMoves])
}

function handleRemoteMove(state: BoardState, event: BoardEvent): BoardState {
  const { issue, previousStatusId } = event
  const issueOrder = { ...state.issueOrder }

  // Remove from previous column
  if (previousStatusId && issueOrder[previousStatusId]) {
    issueOrder[previousStatusId] = issueOrder[previousStatusId].filter((id) => id !== issue.id)
  }

  // Add to new column in correct position based on rank. Filtering first means
  // a same-column move or a replayed event cannot duplicate the card
  const targetColumn = (issueOrder[issue.statusId] || []).filter((id) => id !== issue.id)
  const insertIndex = findInsertIndex(targetColumn, issue.rank, state.issues)
  targetColumn.splice(insertIndex, 0, issue.id)
  issueOrder[issue.statusId] = targetColumn

  // Update issue data
  return {
    ...state,
    issueOrder,
    issues: {
      ...state.issues,
      [issue.id]: issue,
    },
  }
}
```
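`findInsertIndex`, called from `handleRemoteMove`, is not defined in this section. A minimal sketch follows, assuming ranks compare correctly with plain string comparison (the same ordering the server uses) and that the column is already sorted by rank. The `RankedIssue` type is a stand-in for the relevant part of `Issue`.

```typescript
interface RankedIssue {
  rank: string
}

// Binary search for the first position whose rank is not below `rank`.
// Issue IDs missing from `issues` are treated as sorting after the new rank.
function findInsertIndex(
  columnOrder: readonly string[],
  rank: string,
  issues: Record<string, RankedIssue | undefined>,
): number {
  let lo = 0
  let hi = columnOrder.length

  while (lo < hi) {
    const mid = (lo + hi) >>> 1
    const midRank = issues[columnOrder[mid]]?.rank
    if (midRank !== undefined && midRank < rank) {
      lo = mid + 1
    } else {
      hi = mid
    }
  }

  return lo
}
```

The search is O(log n) per remote move. If the column is only partially loaded because of pagination, the result is the position within the loaded window, not within the full column.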

Column Virtualization

For boards with many issues per column, virtualize the issue list:

```tsx
import { useRef } from 'react';
import { useVirtualizer } from '@tanstack/react-virtual';

function VirtualizedColumn({
  statusId,
  issueIds
}: {
  statusId: string;
  issueIds: string[]
}) {
  // The scroll container needs a bounded height and overflow-y set in CSS
  const parentRef = useRef<HTMLDivElement>(null);

  const virtualizer = useVirtualizer({
    count: issueIds.length,
    getScrollElement: () => parentRef.current,
    estimateSize: () => 80, // Estimated card height
    overscan: 5             // Render 5 extra items for smooth scrolling
  });

  return (
    <div ref={parentRef} className="column-scroll-container">
      <div
        style={{
          height: `${virtualizer.getTotalSize()}px`,
          position: 'relative'
        }}
      >
        {virtualizer.getVirtualItems().map((virtualItem) => (
          <div
            key={virtualItem.key}
            style={{
              position: 'absolute',
              top: 0,
              left: 0,
              width: '100%',
              transform: `translateY(${virtualItem.start}px)`
            }}
          >
            <IssueCard issueId={issueIds[virtualItem.index]} />
          </div>
        ))}
      </div>
    </div>
  );
}
```

Infrastructure

Cloud-Agnostic Components

Component	Purpose	Options
API Gateway	Request routing, auth	Kong, Nginx, Traefik
GraphQL Server	Query execution	Apollo Server, Mercurius
Message Queue	Event streaming	Kafka, RabbitMQ, NATS
Cache	Board state, sessions	Redis, Memcached, KeyDB
Search	Full-text search	Elasticsearch, Meilisearch, Typesense
Object Storage	Attachments	MinIO, Ceph, S3-compatible
Database	Primary store	PostgreSQL, CockroachDB

AWS Reference Architecture

Service configurations:

Service	Configuration	Rationale
GraphQL (Fargate)	2 vCPU, 4GB RAM	Stateless, scale on request rate
WebSocket (Fargate)	2 vCPU, 4GB RAM	Connection-bound, ~10K per instance
Workers (Spot)	1 vCPU, 2GB RAM	Cost optimization for async
RDS PostgreSQL	db.r6g.xlarge Multi-AZ	Primary store, read replicas for scale
ElastiCache	r6g.large cluster	Board cache, pub/sub
OpenSearch	m6g.large.search × 3	Search index, 3 nodes for HA

Scaling Considerations

Read-heavy workload:

Read replicas for PostgreSQL
Redis caching for board state
CDN for static assets

WebSocket connections:

Sticky sessions to WebSocket servers
Redis pub/sub for cross-instance broadcast
~10K connections per 4GB instance

Search indexing:

Async indexing via Kafka
Dedicated OpenSearch domain
Index aliases for zero-downtime reindexing

Conclusion

This design provides a flexible issue tracking system with:

O(1) reordering via LexoRank eliminates cascading updates.
Per-column cursor pagination ensures all columns load incrementally.
Optimistic locking handles concurrent edits with minimal conflict.
Project-scoped workflows allow team customisation without global impact.
Real-time sync via a server-assigned lastSyncId plus delta packets gives sub-300 ms propagation and a clean offline-reconnect story.
Hybrid RBAC + issue-level security mirrors how teams reason about access; permission resolution is short-circuit and cached per (user, project).
Outbox + CDC search ingestion keeps OpenSearch eventually consistent without dual-write drift.
Per-channel notification fan-out with per-user digest avoids cross-channel head-of-line blocking.
Presigned multipart uploads and an async antivirus stage keep large blobs out of the API and out of Postgres.

Key architectural decisions:

LexoRank for ordering trades storage (growing strings) for write efficiency.
Per-column pagination over global pagination ensures balanced board views.
Last-write-wins is acceptable for most fields; CRDTs reserved for rich text.
Denormalised Redis cache trades consistency for read performance.
Server-assigned monotonic lastSyncId is preferred over vector / hybrid logical clocks for issue-tracker workloads where conflicts are rare.

Known limitations:

LexoRank requires periodic rebalancing (background job).
Last-write-wins may lose concurrent edits on the same scalar field.
Large boards (>1000 issues) need virtualisation.
Postgres FTS caps out around the tsvector size limit; promote to OpenSearch / Typesense per tenant.

Future enhancements:

Field-level CRDTs for conflict-free concurrent editing on scalar fields where it is worth the cost.
GraphQL federation for microservices decomposition.
Per-tenant search engine routing (Postgres FTS for small tenants, OpenSearch for large).

Appendix

Prerequisites

Distributed systems fundamentals (eventual consistency, optimistic locking)
GraphQL basics (queries, mutations, subscriptions)
React state management patterns
SQL and database design

Terminology

Term	Definition
LexoRank	Lexicographically sortable string for ordering without cascading updates
Optimistic locking	Concurrency control using version numbers to detect conflicts
Workflow	Set of statuses and allowed transitions between them
Fractional indexing	Using real numbers (or strings) for ordering with O(1) insertions
Cursor-based pagination	Using opaque cursors instead of offsets for stable pagination
Last-write-wins (LWW)	Conflict resolution where the most recent write wins; in this design, the write the server orders last by lastSyncId

Summary

LexoRank ordering enables O(1) drag-and-drop without updating other rows
Per-column pagination with cursor-based approach ensures balanced board loading
Optimistic locking with version field detects concurrent modifications
Project-scoped workflows allow custom statuses without schema changes
GraphQL subscriptions provide real-time updates with sub-300ms propagation
Denormalized Redis cache trades consistency for fast board reads

References

Issue Tracker APIs:

Jira Software Cloud REST API — board and agile endpoints
Jira Cloud Platform REST API — issue and workflow endpoints
Linear Developers — GraphQL API — GraphQL schema and usage
GitHub GraphQL API — issues and projects via GraphQL
Asana API reference — task and section ordering

Ordering Algorithms:

Figma — Realtime Editing of Ordered Sequences — fractional indexing at scale
Atlassian KB — Troubleshooting LexoRank System Issues — bucket model, rebalance thresholds, integrity checks
Atlassian Greenhopper — LexoRankBalanceOperation API — bucket round-robin reference
rocicorp/fractional-indexing — reference implementation

Sync and Real-time:

Scaling the Linear Sync Engine — local-first architecture (first-party)
Reverse-engineering Linear’s sync engine — endorsed by Linear’s CTO; LWW + selective CRDT detail
Conflict-free Replicated Data Types — CRDT resources

Permissions and AuthZ:

JIRA Permissions General Overview — global / project / issue-level layers
Configuring issue-level security (Jira) — security schemes and inheritance rules
Oso — RBAC vs ABAC vs PBAC — access-control model trade-offs

Search:

PostgreSQL — Text Search Limitations — tsvector and lexeme-position caps
Debezium — Outbox Event Router — outbox + CDC pattern
Typesense vs Algolia vs Elasticsearch vs Meilisearch — engine comparison

Notifications and Attachments:

How Slack builds smart notification systems (Courier) — presence-aware routing
Configure file attachments (Jira Cloud) — per-tenant size and quota model
GuardDuty Malware Protection for S3 — managed AV-on-upload

System Design:

Optimistic Concurrency Control — concurrency patterns
Relay Cursor Connections specification — cursor-based pagination contract
Stripe — Idempotent requests — Idempotency-Key semantics
Kulkarni et al., Logical Physical Clocks and Consistent Snapshots — HLC reference

¹ Conflict-free Replicated Data Types: overview of CRDT families used for collaborative text.
² Stripe, Idempotent requests: canonical Idempotency-Key header semantics.
³ Kulkarni et al., Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases (HLC).
⁴ JIRA Permissions General Overview and Configuring issue-level security.
⁵ Oso, RBAC vs ABAC vs PBAC.
⁶ PostgreSQL docs, Text Search: Limitations.
⁷ Debezium, Outbox Event Router.
⁸ Slack via Courier, How Slack builds smart notification systems.
⁹ Atlassian, Configure file attachments.
¹⁰ AWS, GuardDuty Malware Protection for S3.