Design an Email System

A comprehensive system design for building a scalable email service like Gmail or Outlook. This design addresses reliable delivery, spam filtering, conversation threading, and search at scale—handling billions of messages daily with sub-second search latency and 99.99% delivery reliability.

Mermaid diagram — High-level architecture: Separate inbound (receiving) and outbound (sending) mail paths with shared storage and search infrastructure.

Abstract

Email systems solve four interconnected challenges: reliable delivery (messages must never be lost), authentication (prevent spoofing and phishing), spam filtering (99.9%+ of spam blocked with minimal false positives), and fast retrieval (sub-second search across years of messages).

Core architectural decisions:

Decision	Choice	Rationale
Inbound protocol	SMTP (RFC 5321)	Universal standard, store-and-forward resilience
Client access	IMAP (RFC 3501) + REST API	IMAP for desktop clients, REST for web/mobile
Authentication	SPF + DKIM + DMARC	Defense in depth: server auth, content auth, policy
Spam filtering	ML (Naive Bayes) + rules	99.9%+ detection with low false positives
Message storage	Wide-column DB (Cassandra)	Time-series access pattern, horizontal scaling
Search	Inverted index (Elasticsearch)	Full-text search with field-specific filtering
Threading	RFC 5322 headers + heuristics	References header for chain, subject fallback

Key trade-offs accepted:

Store-and-forward adds latency (seconds to minutes) but ensures delivery reliability
Per-message spam analysis increases CPU cost but reduces false positives vs. IP-only blocking
Denormalized message storage increases write cost but enables fast mailbox queries
Eventual consistency for search index (seconds delay) in exchange for write throughput

What this design optimizes:

99.99% delivery success rate with automatic retries
Sub-100ms mailbox listing, sub-500ms full-text search
Blocks 99.9% of spam while keeping false positive rate below 0.01%
Horizontal scaling to billions of messages per day

Requirements

Functional Requirements

Requirement	Priority	Notes
Send emails (SMTP submission)	Core	Authenticated sending via port 587
Receive emails (SMTP inbound)	Core	Accept mail for hosted domains
Web/mobile mailbox access	Core	REST API for modern clients
IMAP access	Core	Desktop client compatibility
Spam filtering	Core	Block spam, phishing, malware
Email authentication	Core	SPF, DKIM, DMARC validation
Full-text search	Core	Search body, subject, participants
Conversation threading	Core	Group related messages
Labels/folders	Core	User organization
Attachments	Core	Store and retrieve file attachments
Contact autocomplete	Extended	Suggest recipients while composing
Scheduled send	Extended	Send at specified future time
Undo send	Extended	Brief cancellation window

Non-Functional Requirements

Requirement	Target	Rationale
Availability	99.99% (4 nines)	Email is critical communication; 52 min/year downtime max
Delivery latency	p99 < 30 seconds	User expectation for “instant” delivery
Search latency	p99 < 500ms	Real-time search experience
Mailbox list latency	p99 < 100ms	Responsive UI on folder open
Spam detection rate	> 99.9%	Unusable inbox without effective filtering
False positive rate	< 0.01%	Legitimate mail must not be blocked
Message durability	99.9999%	No email should ever be lost
Retention	15+ years	Long-term archival for compliance

Scale Estimation

Users:

Monthly Active Users (MAU): 500M
Daily Active Users (DAU): 200M (40% of MAU)
Mailboxes: 500M (1 per user)

Traffic (inbound + outbound):

Messages per user per day: 40 received, 10 sent
Daily inbound: 500M × 40 = 20B messages/day
Daily outbound: 200M × 10 = 2B messages/day
Peak messages per second: 20B / 86400 × 3 (peak multiplier) = ~700K msgs/sec inbound

Storage:

Average message size: 75KB (body + headers, excluding attachments)
Average attachment size: 500KB (only 20% of messages have attachments)
Daily message storage: 20B × 75KB = 1.5PB/day
Daily attachment storage: 20B × 0.2 × 500KB = 2PB/day
15-year retention: ~20EB (with compression, ~5EB)

Search index:

Index size: ~20% of message storage (text extraction)
Daily index growth: ~300TB
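
The estimates above can be reproduced with a few lines of arithmetic. The sketch below uses only the inputs stated in this section (decimal units, a 3× peak multiplier); the constant and variable names are illustrative.

```typescript
// Back-of-envelope capacity figures from the inputs in this section.
const SECONDS_PER_DAY = 86400
const KB = 1e3
const TB = 1e12
const PB = 1e15
const EB = 1e18

const mailboxes = 500e6
const dailyActiveUsers = 200e6

const inboundPerDay = mailboxes * 40 // 20B messages/day
const outboundPerDay = dailyActiveUsers * 10 // 2B messages/day
const peakInboundPerSec = (inboundPerDay / SECONDS_PER_DAY) * 3

const messageBytesPerDay = inboundPerDay * 75 * KB // bodies + headers
const attachmentBytesPerDay = inboundPerDay * 0.2 * 500 * KB // 20% carry attachments
const retentionBytes = (messageBytesPerDay + attachmentBytesPerDay) * 365 * 15
const indexBytesPerDay = messageBytesPerDay * 0.2

const summary = {
  peakInboundPerSec: Math.round(peakInboundPerSec),
  messagePBPerDay: messageBytesPerDay / PB,
  attachmentPBPerDay: attachmentBytesPerDay / PB,
  retentionEB: retentionBytes / EB,
  indexTBPerDay: indexBytesPerDay / TB,
  outboundPerDay,
}
```

Evaluating `summary` gives roughly 694K messages/sec at peak and about 19 EB of raw data over 15 years, in line with the rounded figures above.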

Design Paths

Path A: Monolithic MTA (Traditional)

Best when:

Smaller scale (< 1M mailboxes)
On-premises deployment
Simpler operations preferred
Standard email features sufficient

Key characteristics:

Single MTA handles sending, receiving, storage
File-based storage (Maildir or mbox format)
Local spam filtering with SpamAssassin
IMAP server (Dovecot) for client access

Trade-offs:

✅ Simple deployment and operations
✅ Mature, well-understood stack
✅ Low infrastructure cost
❌ Vertical scaling limits (~100K mailboxes per server)
❌ No built-in redundancy
❌ Limited search capabilities
❌ Manual spam rule updates

Real-world example: Traditional enterprise mail servers, small hosting providers, self-hosted mail (Mail-in-a-Box, Mailcow).

Path B: Microservices with Shared Storage (Cloud-Native)

Best when:

Large scale (10M+ mailboxes)
Cloud deployment
Need for advanced features (smart compose, nudges)
Global distribution required

Key characteristics:

Separate services for ingestion, storage, search, sending
Distributed database for messages (Cassandra, Bigtable)
Dedicated search cluster (Elasticsearch)
Object storage for attachments
ML-based spam filtering

Trade-offs:

✅ Horizontal scaling to billions of mailboxes
✅ Independent service scaling
✅ Advanced ML features possible
✅ Multi-region deployment
❌ Complex operations
❌ Higher infrastructure cost
❌ Eventual consistency challenges

Real-world example: Gmail (Bigtable + custom indexing), Outlook.com (Exchange Online + Azure), Fastmail (custom Cyrus-derived stack).

Path C: Hybrid with ESP Integration

Best when:

Need transactional + marketing email
Deliverability is critical concern
Limited email infrastructure expertise
Variable sending volumes

Key characteristics:

Outbound via ESP (managed deliverability)
Inbound via webhooks or forwarding
ESP handles reputation, authentication, compliance
Application focuses on business logic

Trade-offs:

✅ Managed deliverability and reputation
✅ Built-in analytics and tracking
✅ No MTA operations burden
✅ Elastic scaling
❌ Per-message cost at scale
❌ Less control over delivery timing
❌ Vendor lock-in concerns
❌ Limited for receiving mail

Real-world example: SaaS applications using SendGrid/Mailgun for transactional email, marketing platforms using dedicated ESPs.

Path Comparison

Factor	Monolithic	Microservices	Hybrid/ESP
Scale	< 1M mailboxes	Billions	Variable
Complexity	Low	High	Medium
Cost at scale	Lower	Higher	Highest
Deliverability control	Full	Full	Delegated
Feature velocity	Slow	Fast	Medium
Ops burden	Medium	High	Low
Examples	Enterprise Exchange	Gmail, Outlook.com	SaaS apps

This Article’s Focus

This article focuses on Path B (Microservices) because:

Represents architecture of major email providers (Gmail, Outlook)
Demonstrates scale challenges unique to email (spam, threading, search)
Covers both sending and receiving infrastructure
Addresses deliverability, authentication, and compliance concerns

High-Level Design

Inbound Mail Flow

When an external server sends mail to your domain:

MX Server responsibilities:

Connection handling: Accept SMTP connections, enforce rate limits
Recipient validation: Verify mailbox exists before accepting
Authentication checks: SPF, DKIM, DMARC validation
Spam scoring: Pass to spam filter, act on classification
Message queuing: Hand off to storage layer

Why accept-then-filter (not reject during SMTP)?

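Rejecting spam during the SMTP transaction (5xx response) leaves the connecting MTA responsible for notifying the sender. When that MTA is an intermediate relay, it generates a bounce to the envelope sender. Spammers forge sender addresses, so those bounces go to innocent parties (backscatter). Accepting the message and filing it as spam, without ever generating a bounce, avoids this.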

Outbound Mail Flow

When a user sends an email:

Outbound MTA responsibilities:

DKIM signing: Cryptographically sign message for authentication
MX resolution: Look up recipient mail servers
Connection pooling: Reuse connections to frequent destinations
Retry management: Exponential backoff for temporary failures
Bounce handling: Process permanent failures, notify sender
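
To make the retry and bounce responsibilities concrete, here is a minimal sketch of how a delivery worker might act on the final SMTP reply. The 2xx/4xx/5xx reply classes come from RFC 5321; the function names and the backoff base and cap are assumptions for illustration, not values taken from this design.

```typescript
type DeliveryAction = "delivered" | "retry" | "bounce"

// Classify the final SMTP reply code for a delivery attempt.
// 2xx = accepted, 5xx = permanent failure, anything else (4xx, or an
// unexpected code) is treated as transient and retried.
function classifySmtpReply(code: number): DeliveryAction {
  if (code >= 200 && code < 300) return "delivered"
  if (code >= 500 && code < 600) return "bounce"
  return "retry"
}

// Exponential backoff: the delay doubles on each attempt up to a cap.
// baseSeconds and capSeconds are hypothetical defaults.
function backoffSeconds(attempt: number, baseSeconds = 300, capSeconds = 86400): number {
  return Math.min(capSeconds, baseSeconds * Math.pow(2, attempt))
}
```

A production MTA would typically add random jitter to each delay so that retries to a recovering destination do not arrive in synchronized bursts.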

Mailbox Service

Handles message storage, retrieval, and organization:

Key operations:

Operation	Description	Access Pattern
List messages	Get messages in folder	Range query by folder + date
Get message	Retrieve full message	Point lookup by message_id
Move/label	Organize messages	Update metadata
Delete	Remove message	Soft delete (trash), hard delete
Search	Full-text query	Search index query
Sync	IMAP/API delta sync	Cursor-based pagination

State per message:

interface EmailMessage {
  messageId: string // Globally unique (RFC 5322 Message-ID)
  internalId: string // System-assigned UUID
  mailboxId: string // Owner's mailbox
  threadId: string // Conversation grouping

  // Headers (denormalized for queries)
  from: EmailAddress
  to: EmailAddress[]
  cc: EmailAddress[]
  subject: string
  date: Date // From Date header
  receivedAt: Date // Server receive time

  // Threading headers
  inReplyTo?: string // Message-ID of parent
  references: string[] // Full ancestor chain

  // Content
  bodyText?: string // Plain text version
  bodyHtml?: string // HTML version
  snippet: string // First 200 chars for preview

  // Organization
  labels: string[] // User labels (INBOX, SENT, custom)
  isRead: boolean
  isStarred: boolean

  // Metadata
  sizeBytes: number
  hasAttachments: boolean
  attachments: AttachmentRef[]

  // Spam/security
  spamScore: number
  authenticationResults: AuthResult
}

interface AttachmentRef {
  attachmentId: string
  filename: string
  contentType: string
  sizeBytes: number
  storageUrl: string // S3/GCS URL
}

Search Service

Full-text search across all message content:

Index structure:

interface SearchDocument {
  messageId: string
  mailboxId: string // Partition key for isolation
  threadId: string

  // Searchable fields
  from: string // Tokenized email + name
  to: string[]
  cc: string[]
  subject: string // Tokenized
  body: string // Full-text, tokenized
  attachmentNames: string[]

  // Filterable fields
  labels: string[]
  date: Date
  hasAttachment: boolean
  isRead: boolean
  isStarred: boolean

  // Spam fields
  spamScore: number
}

Query capabilities:

Full-text: "quarterly report" (phrase match)
Field-specific: from:alice@example.com
Boolean: from:alice AND has:attachment
Date range: after:2024/01/01 before:2024/06/01
Labels: label:work -label:newsletters
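
The operator syntax above can be split into free-text terms, quoted phrases, and field filters before being translated into an index query. The following is a minimal sketch of such a parser; `ParsedQuery` and `parseSearchQuery` are hypothetical names, and the grammar is deliberately simplified (no parentheses or OR, and a bare `AND` is treated as the implicit default).

```typescript
interface QueryFilter {
  field: string
  value: string
  negated: boolean
}

interface ParsedQuery {
  terms: string[] // free-text words
  phrases: string[] // quoted phrases, matched exactly
  filters: QueryFilter[] // field:value operators such as from: or label:
}

function parseSearchQuery(query: string): ParsedQuery {
  const parsed: ParsedQuery = { terms: [], phrases: [], filters: [] }
  // Alternatives: "quoted phrase" | [-]field:value | bare word
  const tokenPattern = /"([^"]+)"|(-?)(\w+):(\S+)|(\S+)/g
  let m: RegExpExecArray | null
  while ((m = tokenPattern.exec(query)) !== null) {
    if (m[1] !== undefined) {
      parsed.phrases.push(m[1])
    } else if (m[3] !== undefined) {
      parsed.filters.push({ field: m[3], value: m[4] ?? "", negated: m[2] === "-" })
    } else if (m[5] !== undefined && m[5] !== "AND") {
      parsed.terms.push(m[5])
    }
  }
  return parsed
}
```

For example, `label:work -label:newsletters "quarterly report"` yields one phrase and two `label` filters, the second with `negated` set to true.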

Threading Service

Groups related messages into conversations:

Algorithm (priority order):

References header: RFC 5322 specifies References contains Message-IDs of all ancestors
In-Reply-To header: Direct parent Message-ID
Subject matching: Same subject (ignoring Re:/Fwd: prefixes) within time window
Participant overlap: Same sender/recipients, similar timing
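
The first three steps of this priority order can be sketched as a small resolver. This is an illustration under stated assumptions, not the production algorithm: `ThreadResolver` and `normalizeSubject` are hypothetical names, lookups are in-memory maps rather than a database, and the time window and participant-overlap checks are omitted.

```typescript
interface IncomingHeaders {
  messageId: string
  inReplyTo?: string
  references: string[]
  subject: string
}

// Strip any run of reply/forward prefixes ("Re:", "Fw:", "Fwd:") and normalize case.
function normalizeSubject(subject: string): string {
  return subject.replace(/^(\s*(re|fwd?)\s*:\s*)+/i, "").trim().toLowerCase()
}

class ThreadResolver {
  private threadByMessageId = new Map<string, string>()
  private threadBySubject = new Map<string, string>()
  private nextId = 1

  resolve(msg: IncomingHeaders): string {
    // 1. References (nearest ancestor first), then 2. In-Reply-To.
    const ancestors = msg.references.slice().reverse()
    if (msg.inReplyTo) ancestors.push(msg.inReplyTo)

    let threadId: string | undefined
    for (const id of ancestors) {
      threadId = this.threadByMessageId.get(id)
      if (threadId) break
    }

    // 3. Subject fallback; otherwise start a new thread.
    const subjectKey = normalizeSubject(msg.subject)
    if (!threadId) threadId = this.threadBySubject.get(subjectKey)
    if (!threadId) threadId = `thread_${this.nextId++}`

    this.threadByMessageId.set(msg.messageId, threadId)
    this.threadBySubject.set(subjectKey, threadId)
    return threadId
  }
}
```

Without the time window, the subject fallback would merge unrelated conversations that share a generic subject such as "Hello", which is why a real implementation needs the timing and participant checks from steps 3 and 4.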

Threading data model:

interface Thread {
  threadId: string
  mailboxId: string

  // Aggregated from messages
  subject: string // From most recent message
  snippet: string // From most recent message
  participants: EmailAddress[] // Union of all From/To/Cc

  // Message list
  messageIds: string[] // Ordered by date
  messageCount: number

  // Thread-level flags
  hasUnread: boolean
  isStarred: boolean // Any message starred
  labels: string[] // Union of all labels

  // Timestamps
  oldestMessageDate: Date
  newestMessageDate: Date
}

Edge cases:

Orphaned replies: Message references unknown Message-ID → create new thread, merge if parent arrives
Subject collision: Different conversations with same subject → use timing + participants to disambiguate
Long threads: Threads with 100+ messages → paginate message list

List Messages

Query parameters:

Parameter	Type	Description
`labelIds`	string[]	Filter by labels (default: INBOX)
`q`	string	Search query
`maxResults`	int	Page size (default: 50, max: 500)
`pageToken`	string	Cursor for pagination
`includeSpam`	bool	Include spam folder

Response (200 OK):

{
  "messages": [
    {
      "id": "msg_abc123",
      "threadId": "thread_xyz789",
      "labelIds": ["INBOX", "IMPORTANT"],
      "snippet": "Hi team, please review the Q4 report...",
      "from": {
        "email": "alice@example.com",
        "name": "Alice Smith"
      },
      "to": [{ "email": "bob@example.com", "name": "Bob Jones" }],
      "subject": "Q4 Report Review",
      "date": "2024-12-15T10:30:00Z",
      "isRead": false,
      "isStarred": false,
      "hasAttachments": true,
      "sizeBytes": 125000
    }
  ],
  "nextPageToken": "cursor_def456",
  "resultSizeEstimate": 1250
}
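
The `nextPageToken` in this response is an opaque cursor. One possible encoding, sketched below, is the sort key of the last row returned, so the next request resumes strictly after it. The token format (`receivedAtMs:id`) and the in-memory `rows` array are assumptions for illustration; a real service would issue the equivalent range query against storage and would likely sign or encrypt the token.

```typescript
interface Row {
  id: string
  receivedAtMs: number
}

interface Page {
  messages: Row[]
  nextPageToken?: string
}

function encodeToken(row: Row): string {
  return `${row.receivedAtMs}:${row.id}`
}

function decodeToken(token: string): Row {
  const sep = token.indexOf(":")
  return { receivedAtMs: Number(token.slice(0, sep)), id: token.slice(sep + 1) }
}

// rows must be sorted newest first, ties broken by id ascending.
function listPage(rows: Row[], maxResults = 50, pageToken?: string): Page {
  const limit = Math.min(Math.max(maxResults, 1), 500) // default 50, max 500
  let start = 0
  if (pageToken) {
    const last = decodeToken(pageToken)
    // First row that sorts strictly after the cursor.
    start = rows.findIndex(
      (r) => r.receivedAtMs < last.receivedAtMs || (r.receivedAtMs === last.receivedAtMs && r.id > last.id),
    )
    if (start === -1) return { messages: [] }
  }
  const messages = rows.slice(start, start + limit)
  const lastRow = messages[messages.length - 1]
  const hasMore = start + limit < rows.length
  return hasMore && lastRow ? { messages, nextPageToken: encodeToken(lastRow) } : { messages }
}
```

Keying the cursor on the sort key rather than an offset keeps pages stable when new mail arrives at the head of the list between requests.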

Get Full Message

Endpoint: GET /api/v1/messages/{messageId}

Query parameters:

Parameter	Type	Description
`format`	enum	`minimal`, `metadata`, `full`, `raw`

Response (200 OK, format=full):

{
  "id": "msg_abc123",
  "threadId": "thread_xyz789",
  "labelIds": ["INBOX", "IMPORTANT"],
  "headers": {
    "from": "Alice Smith <alice@example.com>",
    "to": "Bob Jones <bob@example.com>",
    "subject": "Q4 Report Review",
    "date": "Sun, 15 Dec 2024 10:30:00 -0800",
    "message-id": "<unique-id@example.com>",
    "in-reply-to": "<parent-id@example.com>",
    "references": "<grandparent@example.com> <parent-id@example.com>"
  },
  "body": {
    "text": "Hi team,\n\nPlease review the attached Q4 report...",
    "html": "<html><body><p>Hi team,</p>..."
  },
  "attachments": [
    {
      "id": "att_file123",
      "filename": "Q4-Report.pdf",
      "mimeType": "application/pdf",
      "size": 2500000
    }
  ],
  "authentication": {
    "spf": "pass",
    "dkim": "pass",
    "dmarc": "pass"
  }
}

Send Message

Endpoint: POST /api/v1/messages/send

Request:

{
  "to": [{ "email": "bob@example.com", "name": "Bob Jones" }],
  "cc": [],
  "bcc": [],
  "subject": "Project Update",
  "body": {
    "text": "Hi Bob,\n\nHere's the update you requested...",
    "html": "<p>Hi Bob,</p><p>Here's the update you requested...</p>"
  },
  "attachments": [
    {
      "filename": "update.pdf",
      "mimeType": "application/pdf",
      "content": "base64-encoded-content"
    }
  ],
  "replyTo": "msg_parent123",
  "scheduledAt": null
}

Response (202 Accepted):

{
  "id": "msg_new789",
  "threadId": "thread_xyz789",
  "labelIds": ["SENT"],
  "status": "queued"
}

Search Messages

Endpoint: GET /api/v1/mailboxes/{mailboxId}/messages?q={query}

Query examples:

from:alice@example.com - From specific sender
"quarterly report" - Phrase match
has:attachment larger:5M - With attachment > 5MB
after:2024/01/01 before:2024/06/30 - Date range
in:inbox is:unread - Inbox, unread only

Response: Same format as List Messages.

Download Attachment

Endpoint: GET /api/v1/messages/{messageId}/attachments/{attachmentId}

Response: Redirects to signed URL (S3/GCS presigned URL, 15-minute expiry).

Error Responses

Code	Error	When
400	`INVALID_REQUEST`	Malformed request body
401	`UNAUTHORIZED`	Invalid or expired token
403	`FORBIDDEN`	No access to mailbox
404	`NOT_FOUND`	Message/mailbox doesn’t exist
413	`ATTACHMENT_TOO_LARGE`	Attachment exceeds 25MB limit
429	`RATE_LIMITED`	Too many requests
503	`SERVICE_UNAVAILABLE`	Temporary outage

Rate limits:

Operation	Limit	Window
Send	500 messages	per day
Send (paid)	2,000 messages	per day
API requests	1,000 requests	per minute
Search	100 queries	per minute
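
These limits could be enforced with a sliding-window counter per user and operation. The sketch below keeps timestamps in process memory purely for illustration; `SlidingWindowLimiter` is a hypothetical name, and a deployed version would keep the same state in a shared store so that all API servers see the same counts.

```typescript
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>()
  private limit: number
  private windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true and records the request if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs
    // Drop timestamps that have fallen out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff)
    const allowed = recent.length < this.limit
    if (allowed) recent.push(nowMs)
    this.hits.set(key, recent)
    return allowed
  }
}

// Example: the search limit from the table above (100 queries per minute).
const searchLimiter = new SlidingWindowLimiter(100, 60 * 1000)
```

A request rejected by `allow` maps to the 429 `RATE_LIMITED` error listed earlier.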

IMAP Protocol Support

For desktop client compatibility, expose standard IMAP (RFC 3501):

Supported commands:

Command	Description
`LOGIN` / `AUTHENTICATE`	Authenticate with username/password (`LOGIN`) or OAuth 2.0 over SASL (`AUTHENTICATE`)
`SELECT`	Open mailbox/folder
`SEARCH`	Server-side search
`FETCH`	Retrieve message(s)
`STORE`	Update flags (read, starred, deleted)
`COPY`	Copy to another folder
`EXPUNGE`	Permanently delete
`IDLE`	Push notifications (RFC 2177)

IMAP-to-API mapping:

IMAP folder → API label
IMAP UID → API message ID
IMAP flags → API isRead, isStarred, labels
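
As a rough illustration of this mapping, the sketch below translates an IMAP folder name and flag list into the API fields. `\Seen`, `\Flagged`, and `\Deleted` are standard IMAP system flags; the function name and the choice to map `\Deleted` onto a TRASH label are assumptions, not something this design specifies.

```typescript
interface ApiFlags {
  isRead: boolean
  isStarred: boolean
  labels: string[]
}

function imapToApi(folder: string, imapFlags: string[]): ApiFlags {
  // IMAP system flags are case-insensitive.
  const flags = new Set(imapFlags.map((f) => f.toLowerCase()))
  // INBOX is a case-insensitive special name in IMAP; other folder names are kept as-is.
  const labels = [folder.toUpperCase() === "INBOX" ? "INBOX" : folder]
  if (flags.has("\\deleted")) labels.push("TRASH") // assumed mapping
  return {
    isRead: flags.has("\\seen"),
    isStarred: flags.has("\\flagged"),
    labels,
  }
}
```

The UID mapping is harder than the flag mapping, because IMAP UIDs are per-folder integers while API message IDs are global; that translation needs a persistent lookup table and is not shown here.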

Data Modeling

Message Storage (Cassandra)

Table design for time-series mailbox access:

-- Messages by mailbox and date (primary access pattern)
CREATE TABLE messages_by_mailbox (
    mailbox_id UUID,
    label_id TEXT,
    received_at TIMESTAMP,
    message_id UUID,
    thread_id UUID,
    from_email TEXT,
    from_name TEXT,
    subject TEXT,
    snippet TEXT,
    is_read BOOLEAN,
    is_starred BOOLEAN,
    has_attachments BOOLEAN,
    size_bytes INT,
    PRIMARY KEY ((mailbox_id, label_id), received_at, message_id)
) WITH CLUSTERING ORDER BY (received_at DESC, message_id ASC);

-- Full message content (point lookup)
CREATE TABLE messages (
    message_id UUID PRIMARY KEY,
    mailbox_id UUID,
    thread_id UUID,
    raw_headers TEXT,
    body_text TEXT,
    body_html TEXT,
    attachments LIST<FROZEN<attachment>>,
    authentication_results MAP<TEXT, TEXT>,
    spam_score FLOAT,
    created_at TIMESTAMP
);

-- Thread aggregation
CREATE TABLE threads_by_mailbox (
    mailbox_id UUID,
    label_id TEXT,
    newest_message_at TIMESTAMP,
    thread_id UUID,
    subject TEXT,
    snippet TEXT,
    message_count INT,
    participant_emails SET<TEXT>,
    has_unread BOOLEAN,
    PRIMARY KEY ((mailbox_id, label_id), newest_message_at, thread_id)
) WITH CLUSTERING ORDER BY (newest_message_at DESC, thread_id ASC);

Why Cassandra:

Time-series optimized (messages ordered by date)
Partition per mailbox+label enables efficient folder queries
Linear horizontal scaling
Tunable consistency (eventual OK for reads, quorum for writes)

Partition sizing:

Target: < 100MB per partition
Heavy mailboxes: Partition by (mailbox_id, label_id, month) to bound growth
Typical mailbox: 10K messages × 1KB metadata = 10MB per label partition
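
The sizing rule above can be expressed as a small helper that decides when to add the month bucket. This is a sketch under the numbers given in this section (about 1KB of metadata per message, a 100MB partition target); the function names and the string form of the key are illustrative.

```typescript
const TARGET_PARTITION_BYTES = 100 * 1000 * 1000 // < 100MB per partition

function estimatePartitionBytes(messageCount: number, metadataBytes = 1000): number {
  return messageCount * metadataBytes
}

function needsMonthlyBucket(messageCount: number): boolean {
  return estimatePartitionBytes(messageCount) >= TARGET_PARTITION_BYTES
}

// Build the partition key: (mailbox_id, label_id) normally,
// (mailbox_id, label_id, month) for heavy mailboxes.
function partitionKey(mailboxId: string, labelId: string, receivedAt: Date, heavy: boolean): string {
  if (!heavy) return `${mailboxId}:${labelId}`
  const month = receivedAt.getUTCMonth() + 1
  const bucket = `${receivedAt.getUTCFullYear()}-${month < 10 ? "0" : ""}${month}`
  return `${mailboxId}:${labelId}:${bucket}`
}
```

With month buckets, listing a folder means reading the current bucket first and walking backwards through earlier months until the page is full.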

User and Label Metadata (PostgreSQL)

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    display_name VARCHAR(100),
    password_hash VARCHAR(255),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    last_login_at TIMESTAMPTZ,
    settings JSONB DEFAULT '{}'
);

CREATE TABLE mailboxes (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    email_address VARCHAR(255) UNIQUE NOT NULL,
    storage_quota_bytes BIGINT DEFAULT 15000000000,  -- 15GB default
    storage_used_bytes BIGINT DEFAULT 0,
    message_count INT DEFAULT 0,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE labels (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    mailbox_id UUID REFERENCES mailboxes(id) ON DELETE CASCADE,
    name VARCHAR(100) NOT NULL,
    type VARCHAR(20) DEFAULT 'user',  -- 'system' or 'user'
    color VARCHAR(7),                  -- Hex color for UI
    message_count INT DEFAULT 0,
    unread_count INT DEFAULT 0,
    UNIQUE(mailbox_id, name)
);

-- System labels created per mailbox: INBOX, SENT, DRAFTS, SPAM, TRASH, ALL
CREATE INDEX idx_labels_mailbox ON labels(mailbox_id);

Attachment Storage (S3/GCS)

Storage path convention:

s3://email-attachments/{mailbox_id}/{year}/{month}/{message_id}/{attachment_id}/{filename}

Metadata in database:

CREATE TABLE attachments (
    id UUID PRIMARY KEY,
    message_id UUID NOT NULL,
    filename VARCHAR(255) NOT NULL,
    content_type VARCHAR(100),
    size_bytes BIGINT,
    storage_bucket VARCHAR(100),
    storage_key TEXT,
    checksum_sha256 VARCHAR(64),
    scanned_at TIMESTAMPTZ,
    scan_result VARCHAR(20)  -- 'clean', 'malware', 'pending'
);

Lifecycle rules:

Trash attachments: Delete after 30 days
Spam attachments: Delete after 7 days
Regular attachments: Keep until message deleted

Search Index (Elasticsearch)

Index mapping:

{
  "mappings": {
    "properties": {
      "message_id": { "type": "keyword" },
      "mailbox_id": { "type": "keyword" },
      "thread_id": { "type": "keyword" },
      "from_email": { "type": "keyword" },
      "from_name": { "type": "text" },
      "to_emails": { "type": "keyword" },
      "to_names": { "type": "text" },
      "cc_emails": { "type": "keyword" },
      "subject": {
        "type": "text",
        "analyzer": "email_analyzer"
      },
      "body": {
        "type": "text",
        "analyzer": "email_analyzer"
      },
      "attachment_names": { "type": "text" },
      "labels": { "type": "keyword" },
      "date": { "type": "date" },
      "size_bytes": { "type": "long" },
      "has_attachment": { "type": "boolean" },
      "is_read": { "type": "boolean" },
      "is_starred": { "type": "boolean" }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "email_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "email_domain_filter"]
        }
      }
    }
  }
}

Note: email_domain_filter is a custom token filter and must be defined under settings.analysis.filter before this index can be created; its definition is not shown here.

Index partitioning:

Route documents by mailbox_id so that each mailbox's queries hit a single shard (query isolation)
Typical sizing: 1 shard per 10M messages
Heavy users: Dedicated index with multiple shards

Database Selection Matrix

Data Type	Store	Rationale
User profiles, labels	PostgreSQL	ACID, relational queries, moderate scale
Message metadata	Cassandra	Time-series access, horizontal scaling
Message bodies	Cassandra	Co-located with metadata
Attachments	S3/GCS	Object storage, CDN-compatible
Search index	Elasticsearch	Full-text search, aggregations
Session cache	Redis	Sub-ms latency, TTL support
Rate limiting	Redis	Atomic counters, sliding windows
Delivery queue	Kafka	Reliable async, retry support


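SPF Validation

Sender Policy Framework (RFC 7208) checks that the connecting IP is authorized to send mail for the envelope sender's domain:
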
interface SPFResult {
  result: "pass" | "fail" | "softfail" | "neutral" | "none" | "temperror" | "permerror"
  domain: string
  clientIp: string
  explanation?: string
}

class SPFValidator {
  async validate(senderDomain: string, clientIp: string): Promise<SPFResult> {
    // 1. Query TXT record for SPF policy
    const spfRecord = await this.dns.queryTXT(`${senderDomain}`)
    // Example: "v=spf1 include:_spf.google.com ~all"

    if (!spfRecord || !spfRecord.startsWith("v=spf1")) {
      return { result: "none", domain: senderDomain, clientIp }
    }

    // 2. Parse and evaluate SPF mechanisms
    const mechanisms = this.parseSPF(spfRecord)

    for (const mechanism of mechanisms) {
      const match = await this.evaluateMechanism(mechanism, clientIp, senderDomain)
      if (match) {
        return {
          result: this.qualifierToResult(mechanism.qualifier),
          domain: senderDomain,
          clientIp,
        }
      }
    }

    // 3. Default result if no mechanism matches
    return { result: "neutral", domain: senderDomain, clientIp }
  }

  private qualifierToResult(qualifier: string): SPFResult["result"] {
    switch (qualifier) {
      case "+":
        return "pass"
      case "-":
        return "fail"
      case "~":
        return "softfail"
      case "?":
        return "neutral"
      default:
        return "pass"
    }
  }
}

SPF limitations:

Only validates envelope sender (MAIL FROM), not header From
Breaks on forwarding (forwarding server IP not authorized)
Limit of 10 DNS-querying terms per check (RFC 7208), which bounds the DNS load a single evaluation can trigger
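
The lookup limit is easy to exceed once a record chains several `include:` entries. As a rough aid, the sketch below counts the terms in a single SPF record that trigger DNS queries (`include`, `a`, `mx`, `ptr`, `exists`, and the `redirect` modifier, per RFC 7208). The function name is hypothetical, and it counts only the top-level record: a real validator must also count the lookups made inside each included or redirected record.

```typescript
const DNS_QUERYING_MECHANISMS = ["include", "a", "mx", "ptr", "exists"]

function countSpfDnsTerms(record: string): number {
  let count = 0
  // Skip the leading "v=spf1" version term.
  const terms = record.trim().split(/\s+/).slice(1)
  for (const term of terms) {
    // Drop the optional qualifier (+, -, ~, ?).
    const bare = term.replace(/^[+\-~?]/, "").toLowerCase()
    if (bare.indexOf("redirect=") === 0) {
      count++
      continue
    }
    // Mechanism name is everything before ":" or "/" (e.g. "a/24", "include:example.net").
    const name = bare.split(/[:\/]/)[0] ?? ""
    if (DNS_QUERYING_MECHANISMS.indexOf(name) !== -1) count++
  }
  return count
}
```

`ip4`, `ip6`, and `all` do not query DNS and are therefore not counted.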

DKIM Verification

DomainKeys Identified Mail (RFC 6376) validates message integrity:


interface DKIMResult {
  result: "pass" | "fail" | "none" | "neutral" | "temperror" | "permerror"
  domain: string
  selector: string
  headerFields: string[]
}

class DKIMVerifier {
  async verify(message: RawEmail): Promise<DKIMResult> {
    // 1. Extract DKIM-Signature header
    const signature = this.extractDKIMSignature(message)
    if (!signature) {
      // An unsigned message is reported as "none" (RFC 8601), not "neutral"
      return { result: "none", domain: "", selector: "", headerFields: [] }
    }

    // DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=selector1;
    //   h=from:to:subject:date; bh=base64-body-hash; b=base64-signature

    // 2. Fetch public key from DNS
    const publicKey = await this.dns.queryTXT(`${signature.selector}._domainkey.${signature.domain}`)
    if (!publicKey) {
      // No key published for this selector: permanent failure
      return {
        result: "permerror",
        domain: signature.domain,
        selector: signature.selector,
        headerFields: signature.headers,
      }
    }

    // 3. Verify body hash
    const bodyHash = this.computeBodyHash(message.body, signature.canonicalization.body, signature.algorithm)

    if (bodyHash !== signature.bodyHash) {
      return {
        result: "fail",
        domain: signature.domain,
        selector: signature.selector,
        headerFields: signature.headers,
      }
    }

    // 4. Verify header signature
    const headerData = this.canonicalizeHeaders(message.headers, signature.headers, signature.canonicalization.header)

    const valid = this.verifySignature(headerData, signature.signature, publicKey, signature.algorithm)

    return {
      result: valid ? "pass" : "fail",
      domain: signature.domain,
      selector: signature.selector,
      headerFields: signature.headers,
    }
  }
}

DKIM key considerations:

RSA 2048-bit keys recommended (RFC 8301 requires at least 1024 bits and says signers should use 2048 or more)
Selector rotation: Publish new key, sign with new selector, retire old
Header field selection: Always include From, To, Subject, Date, Message-ID
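
To illustrate how the selector and signed-header list are read from a signature, here is a minimal sketch of parsing the `DKIM-Signature` tag list. The tag names (`d`, `s`, `h`) and the `selector._domainkey.domain` lookup name come from RFC 6376; the function and interface names are hypothetical, and the parser ignores details such as duplicate tags and quoted-printable values.

```typescript
interface DkimSignatureTags {
  domain: string
  selector: string
  signedHeaders: string[]
  keyRecordName: string // DNS name of the TXT record holding the public key
}

function parseDkimSignature(headerValue: string): DkimSignatureTags {
  const tags: Record<string, string> = {}
  for (const part of headerValue.split(";")) {
    const eq = part.indexOf("=")
    if (eq === -1) continue
    // Folding whitespace inside tag values is not significant, so strip it.
    tags[part.slice(0, eq).trim()] = part.slice(eq + 1).replace(/\s+/g, "")
  }
  const domain = tags["d"] ?? ""
  const selector = tags["s"] ?? ""
  const signedHeaders = (tags["h"] ?? "")
    .split(":")
    .filter((h) => h.length > 0)
    .map((h) => h.toLowerCase())
  return { domain, selector, signedHeaders, keyRecordName: `${selector}._domainkey.${domain}` }
}

// Report which of the recommended header fields the signature does not cover.
function missingRecommendedHeaders(sig: DkimSignatureTags): string[] {
  const recommended = ["from", "to", "subject", "date", "message-id"]
  return recommended.filter((h) => sig.signedHeaders.indexOf(h) === -1)
}
```

During selector rotation, the `s=` tag is what lets old and new keys coexist: messages signed before the switch still verify against the old selector's DNS record until it is retired.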

DMARC Policy Enforcement

Domain-based Message Authentication, Reporting, and Conformance (RFC 7489):


interface DMARCResult {
  result: "pass" | "fail" | "none"
  policy: "none" | "quarantine" | "reject"
  alignment: {
    spf: boolean
    dkim: boolean
  }
  domain: string
}

class DMARCEvaluator {
  async evaluate(headerFrom: string, spfResult: SPFResult, dkimResult: DKIMResult): Promise<DMARCResult> {
    const fromDomain = this.extractDomain(headerFrom)

    // 1. Query DMARC policy
    const dmarcRecord = await this.dns.queryTXT(`_dmarc.${fromDomain}`)
    // Example: "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"

    if (!dmarcRecord) {
      return {
        result: "none",
        policy: "none",
        alignment: { spf: false, dkim: false },
        domain: fromDomain,
      }
    }

    const policy = this.parseDMARC(dmarcRecord)

    // 2. Check alignment (domain in From matches authenticated domain)
    const spfAligned = spfResult.result === "pass" && this.domainAligns(fromDomain, spfResult.domain, policy.aspf)

    const dkimAligned = dkimResult.result === "pass" && this.domainAligns(fromDomain, dkimResult.domain, policy.adkim)

    // 3. DMARC passes if either SPF or DKIM is aligned
    const passes = spfAligned || dkimAligned

    return {
      result: passes ? "pass" : "fail",
      policy: policy.p,
      alignment: { spf: spfAligned, dkim: dkimAligned },
      domain: fromDomain,
    }
  }

  private domainAligns(
    fromDomain: string,
    authDomain: string,
    mode: "r" | "s", // relaxed or strict
  ): boolean {
    if (mode === "s") {
      return fromDomain === authDomain
    }
    // Relaxed: organizational domain must match
    return this.getOrgDomain(fromDomain) === this.getOrgDomain(authDomain)
  }
}

DMARC policy actions:

Policy	Action
`p=none`	Monitor only, no enforcement
`p=quarantine`	Deliver to spam folder
`p=reject`	Reject at SMTP level (or discard)
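
Combined with the evaluation result, the policy table reduces to a small decision function. The sketch below is illustrative only: the `Disposition` values and function name are assumptions, and it ignores refinements such as the `pct` tag and local overrides for known forwarders.

```typescript
type DmarcOutcome = "pass" | "fail" | "none"
type DmarcPolicy = "none" | "quarantine" | "reject"
type Disposition = "deliver" | "spam-folder" | "reject"

// Map a DMARC evaluation to what the receiver does with the message.
function dmarcDisposition(result: DmarcOutcome, policy: DmarcPolicy): Disposition {
  // The policy only applies when the message fails DMARC.
  if (result !== "fail") return "deliver"
  switch (policy) {
    case "quarantine":
      return "spam-folder"
    case "reject":
      return "reject"
    default:
      return "deliver" // p=none: monitor only
  }
}
```

Even with `p=none`, the failure is still worth recording, since aggregate reports sent to the domain's `rua` address are how senders discover misconfigured mail sources.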


Naive Bayes Classifier

class NaiveBayesSpamFilter {
  // Number of spam/ham messages that contain each token
  private spamWordCounts: Map<string, number> = new Map()
  private hamWordCounts: Map<string, number> = new Map()
  private totalSpam: number = 0
  private totalHam: number = 0

  // Training: update counts from labeled messages
  train(message: string, isSpam: boolean): void {
    // Count each token once per message so count / total stays a valid probability
    const tokens = new Set(this.tokenize(message))
    const counts = isSpam ? this.spamWordCounts : this.hamWordCounts

    for (const token of tokens) {
      counts.set(token, (counts.get(token) || 0) + 1)
    }

    if (isSpam) this.totalSpam++
    else this.totalHam++
  }

  // Classification: compute P(spam|message)
  classify(message: string): { isSpam: boolean; score: number } {
    const tokens = new Set(this.tokenize(message))

    // Smoothed priors (avoids NaN and log(0) before both classes have training data)
    const pSpam = (this.totalSpam + 1) / (this.totalSpam + this.totalHam + 2)
    const pHam = 1 - pSpam

    // Log probabilities to avoid underflow
    let logPSpamGivenMessage = Math.log(pSpam)
    let logPHamGivenMessage = Math.log(pHam)

    for (const token of tokens) {
      // P(token|spam) with Laplace smoothing
      const spamCount = this.spamWordCounts.get(token) || 0
      const hamCount = this.hamWordCounts.get(token) || 0

      const pTokenGivenSpam = (spamCount + 1) / (this.totalSpam + 2)
      const pTokenGivenHam = (hamCount + 1) / (this.totalHam + 2)

      logPSpamGivenMessage += Math.log(pTokenGivenSpam)
      logPHamGivenMessage += Math.log(pTokenGivenHam)
    }

    // Convert back to probability
    const maxLog = Math.max(logPSpamGivenMessage, logPHamGivenMessage)
    const pSpamNormalized = Math.exp(logPSpamGivenMessage - maxLog)
    const pHamNormalized = Math.exp(logPHamGivenMessage - maxLog)

    const score = pSpamNormalized / (pSpamNormalized + pHamNormalized)

    return {
      isSpam: score > 0.9, // High threshold to minimize false positives
      score,
    }
  }

  private tokenize(text: string): string[] {
    return text
      .toLowerCase()
      .split(/\W+/)
      .filter((token) => token.length > 2 && token.length < 20)
  }
}

Why Naive Bayes works for spam:

Handles high-dimensional feature spaces (thousands of words) efficiently
Trains incrementally (user feedback updates model)
Achieves 99%+ accuracy despite “naive” independence assumption
Computationally cheap (O(n) where n = tokens in message)

Spammer countermeasures and responses:

Attack	Response
Bayesian poisoning (inject ham words)	Weight tokens by information gain
Image-only spam	OCR text extraction
Character substitution (V1agra)	Normalization, character n-grams
URL shorteners	Expand and analyze destination
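
The normalization and character n-gram responses in the table can be sketched as a preprocessing step ahead of tokenization. The substitution map below is a small illustrative assumption, not a complete list, and the function names are hypothetical.

```typescript
// Common look-alike substitutions mapped back to letters (illustrative subset).
const LOOKALIKES: Record<string, string> = {
  "0": "o",
  "1": "i",
  "3": "e",
  "4": "a",
  "5": "s",
  "7": "t",
  "@": "a",
  "$": "s",
}

function normalizeToken(token: string): string {
  return token.toLowerCase().replace(/[013457@$]/g, (ch) => LOOKALIKES[ch] ?? ch)
}

// Overlapping character n-grams, so partial obfuscation still shares features
// with the original word.
function charNgrams(token: string, n = 3): string[] {
  const grams: string[] = []
  for (let i = 0; i + n <= token.length; i++) {
    grams.push(token.slice(i, i + n))
  }
  return grams
}
```

Blind substitution also rewrites legitimate numbers (an order number, a year), so in practice it is better applied only to tokens that mix letters with digits or symbols, with the normalized form added as an extra feature rather than replacing the original token.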

Heuristic Rules (SpamAssassin-style)


interface SpamRule {
  name: string
  score: number // Positive = spam indicator
  test: (message: ParsedEmail) => boolean
}

const SPAM_RULES: SpamRule[] = [
  {
    name: "SUBJ_ALL_CAPS",
    score: 1.5,
    // Require at least one letter so numeric-only subjects do not match
    test: (msg) =>
      msg.subject.length > 10 && /[A-Z]/.test(msg.subject) && msg.subject === msg.subject.toUpperCase(),
  },
  {
    name: "FROM_DISPLAY_MISMATCH",
    score: 2.0,
    test: (msg) => {
      // "service@paypal.com <hacker@evil.com>" - display name advertises a
      // domain that differs from the actual sender domain
      const displayDomain = msg.fromName?.match(/@?(\w+\.\w+)/)?.[1]?.toLowerCase()
      const actualDomain = msg.from.split("@")[1]?.toLowerCase()
      // Coerce to boolean: the rule interface requires a boolean result
      return Boolean(displayDomain) && displayDomain !== actualDomain
    },
  },
  {
    name: "MISSING_DATE",
    score: 1.0,
    test: (msg) => !msg.headers["date"],
  },
  {
    name: "FORGED_OUTLOOK_TAGS",
    score: 3.0,
    test: (msg) => {
      // Claims Outlook but carries no X-MS-* headers at all
      // (assumes header names are stored lowercased)
      const ua = msg.headers["x-mailer"] || ""
      const hasMsHeader = Object.keys(msg.headers).some((name) => name.startsWith("x-ms-"))
      return ua.includes("Outlook") && !hasMsHeader
    },
  },
  {
    name: "URI_MISMATCH",
    score: 2.5,
    test: (msg) => {
      // Link text says paypal.com but href goes elsewhere
      const links = extractLinks(msg.bodyHtml)
      return links.some((l) => l.text.includes("paypal.com") && !l.href.includes("paypal.com"))
    },
  },
]

function computeHeuristicScore(message: ParsedEmail): number {
  return SPAM_RULES.filter((rule) => rule.test(message)).reduce((sum, rule) => sum + rule.score, 0)
}
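
SpamAssassin-style filters usually report which rules fired alongside the total, which makes false positives easier to debug. The sketch below is a self-contained variant under stated assumptions: `MiniEmail`, `Rule`, and `evaluateRules` are illustrative names that do not appear in the listing above, and only two of the rules are reproduced.

```typescript
// Minimal stand-in for ParsedEmail with only the fields used here (assumption).
interface MiniEmail {
  subject: string
  headers: Record<string, string>
}

interface Rule {
  name: string
  score: number
  test: (msg: MiniEmail) => boolean
}

const DEMO_RULES: Rule[] = [
  {
    name: "SUBJ_ALL_CAPS",
    score: 1.5,
    test: (m) => m.subject === m.subject.toUpperCase() && m.subject.length > 10,
  },
  { name: "MISSING_DATE", score: 1.0, test: (m) => !m.headers["date"] },
]

interface HeuristicReport {
  score: number
  fired: string[] // Names of the rules that matched
}

// Evaluate every rule once, then derive both the total and the rule names.
function evaluateRules(msg: MiniEmail, rules: Rule[] = DEMO_RULES): HeuristicReport {
  const fired = rules.filter((rule) => rule.test(msg))
  return {
    score: fired.reduce((sum, rule) => sum + rule.score, 0),
    fired: fired.map((rule) => rule.name),
  }
}

const ruleReport = evaluateRules({ subject: "CLAIM YOUR PRIZE NOW", headers: {} })
// ruleReport.score is 2.5; ruleReport.fired is ["SUBJ_ALL_CAPS", "MISSING_DATE"]
```

The list of fired rules could be written into a diagnostic header on the stored message, so a user who reports a false positive gives operators something concrete to inspect.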

Message Delivery Queue

Outbound Queue with Retry Logic


interface QueuedMessage {
  messageId: string
  recipientDomain: string
  recipientEmail: string
  payload: Buffer // DKIM-signed message
  attempts: number // Delivery attempts already made
  nextAttemptAt: Date
  createdAt: Date
  expiresAt: Date // 5 days for bounce generation
}

class OutboundQueue {
  private readonly kafka: KafkaProducer

  async enqueue(message: OutboundMessage): Promise<void> {
    // Partition by recipient domain for connection pooling
    await this.kafka.send({
      topic: "outbound-mail",
      messages: [
        {
          key: message.recipientDomain,
          // Note: JSON.stringify turns Dates into ISO strings and a Buffer into
          // { type: "Buffer", data: [...] }; the consumer must revive both
          value: JSON.stringify({
            messageId: message.id,
            recipientDomain: message.recipientDomain,
            recipientEmail: message.recipient,
            payload: message.signedContent,
            attempts: 0,
            nextAttemptAt: new Date(),
            createdAt: new Date(),
            expiresAt: new Date(Date.now() + 5 * 24 * 60 * 60 * 1000),
          }),
        },
      ],
    })
  }
}

class DeliveryWorker {
  // Delay in seconds before each attempt
  private readonly RETRY_DELAYS = [
    0, // Immediate
    5 * 60, // 5 minutes
    30 * 60, // 30 minutes
    2 * 60 * 60, // 2 hours
    8 * 60 * 60, // 8 hours
    24 * 60 * 60, // 24 hours
  ]

  async processMessage(queued: QueuedMessage): Promise<void> {
    try {
      const mxRecords = await this.dns.queryMX(queued.recipientDomain)
      // Copy before sorting so the resolver's result is not mutated
      const sortedMx = [...mxRecords].sort((a, b) => a.priority - b.priority)

      for (const mx of sortedMx) {
        try {
          await this.deliverToMx(mx.exchange, queued)
          await this.markDelivered(queued.messageId)
          return
        } catch (error) {
          if (this.isPermanentError(error)) {
            throw error // Don't try other MX servers
          }
          // Try next MX server
          continue
        }
      }

      throw new Error("All MX servers failed")
    } catch (error) {
      // expiresAt may arrive as an ISO string after the JSON round trip
      const expired = Date.now() >= new Date(queued.expiresAt).getTime()

      if (this.isPermanentError(error) || expired || queued.attempts >= 6) {
        await this.generateBounce(queued, error)
        await this.markFailed(queued.messageId)
      } else {
        // Schedule retry; past the end of the table, reuse the last (24h) delay
        const index = Math.min(queued.attempts + 1, this.RETRY_DELAYS.length - 1)
        await this.scheduleRetry(queued, this.RETRY_DELAYS[index])
      }
    }
  }

  private isPermanentError(error: unknown): boolean {
    // 5xx errors are permanent (except 552 which can be transient).
    // Errors without an SMTP response code (DNS failure, timeout) are transient.
    const code = (error as { responseCode?: number } | null)?.responseCode
    return typeof code === "number" && code >= 500 && code < 600 && code !== 552
  }
}

Retry backoff schedule:

Attempt	Delay	Cumulative
1	Immediate	0
2	5 minutes	5 min
3	30 minutes	35 min
4	2 hours	2h 35m
5	8 hours	10h 35m
6	24 hours	34h 35m
Bounce	-	~5 days
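
The cumulative column follows directly from the delay list. A small sketch that recomputes it is shown here; `cumulativeOffsets` and `formatDuration` are helper names introduced for this example only.

```typescript
// Delays in seconds before each attempt, mirroring the schedule in the table.
const RETRY_DELAYS_SECONDS = [0, 5 * 60, 30 * 60, 2 * 60 * 60, 8 * 60 * 60, 24 * 60 * 60]

// Running sum: element i is the time since enqueue at which attempt i + 1 runs.
function cumulativeOffsets(delays: number[]): number[] {
  let elapsed = 0
  return delays.map((delay) => {
    elapsed += delay
    return elapsed
  })
}

// Render seconds as "Xh Ym" (or "Ym" when under an hour).
function formatDuration(totalSeconds: number): string {
  const hours = Math.floor(totalSeconds / 3600)
  const minutes = Math.floor((totalSeconds % 3600) / 60)
  return hours > 0 ? `${hours}h ${minutes}m` : `${minutes}m`
}

const retrySchedule = cumulativeOffsets(RETRY_DELAYS_SECONDS).map(formatDuration)
// ["0m", "5m", "35m", "2h 35m", "10h 35m", "34h 35m"]
```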

Threading Algorithm


class ThreadingService {
  async assignThread(message: IncomingMessage): Promise<string> {
    // 1. Check References header (RFC 5322), most recent ancestor first
    if (message.references?.length) {
      // Copy before reversing: Array.prototype.reverse mutates in place
      for (const ref of [...message.references].reverse()) {
        const existingThread = await this.findThreadByMessageId(ref)
        if (existingThread) {
          return existingThread.threadId
        }
      }
    }

    // 2. Check In-Reply-To header
    if (message.inReplyTo) {
      const parentThread = await this.findThreadByMessageId(message.inReplyTo)
      if (parentThread) {
        return parentThread.threadId
      }
    }

    // 3. Subject-based matching (fallback)
    const normalizedSubject = this.normalizeSubject(message.subject)
    const candidateThreads = await this.findThreadsBySubject(message.mailboxId, normalizedSubject, { withinDays: 30 })

    // 4. Filter by participant overlap
    const messageParticipants = new Set([message.from, ...message.to, ...message.cc])

    for (const thread of candidateThreads) {
      const overlap = thread.participants.filter((p) => messageParticipants.has(p)).length

      // Require at least 2 participants in common
      if (overlap >= 2) {
        return thread.threadId
      }
    }

    // 5. Create new thread
    return this.createThread(message)
  }

  private normalizeSubject(subject: string): string {
    // Remove Re:, Fwd:, Fw:, etc. prefixes, including stacked ones
    // ("Re: Fwd: RE: Hello" -> "hello")
    return subject
      .trim()
      .replace(/^((re|fwd?|aw|sv|antw)\s*:\s*)+/i, "")
      .trim()
      .toLowerCase()
  }
}
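
To make the header-based steps concrete, here is a minimal in-memory sketch of the References and In-Reply-To lookup. `InMemoryThreadIndex` and `HeaderInfo` are hypothetical names for this example; the lookup that `findThreadByMessageId` would run against the message store is replaced by a `Map`.

```typescript
interface HeaderInfo {
  messageId: string
  references: string[] // Oldest ancestor first, as in the References header
  inReplyTo?: string
}

class InMemoryThreadIndex {
  private readonly threadByMessageId = new Map<string, string>()
  private nextThread = 1

  assign(msg: HeaderInfo): string {
    // Walk ancestors newest-first without mutating the caller's array,
    // then fall back to In-Reply-To.
    const candidates = [...msg.references].reverse()
    if (msg.inReplyTo) candidates.push(msg.inReplyTo)

    let threadId: string | undefined
    for (const id of candidates) {
      threadId = this.threadByMessageId.get(id)
      if (threadId) break
    }
    if (!threadId) {
      threadId = `thread-${this.nextThread++}`
    }
    this.threadByMessageId.set(msg.messageId, threadId)
    return threadId
  }
}

const threadIndex = new InMemoryThreadIndex()
const t1 = threadIndex.assign({ messageId: "<a@x>", references: [] })
const t2 = threadIndex.assign({ messageId: "<b@x>", references: ["<a@x>"], inReplyTo: "<a@x>" })
// The direct parent <missing@x> was never received, but the older ancestor
// <a@x> still places this reply in the right thread.
const t3 = threadIndex.assign({ messageId: "<c@x>", references: ["<a@x>", "<missing@x>"], inReplyTo: "<missing@x>" })
const t4 = threadIndex.assign({ messageId: "<d@y>", references: [] })
```

Walking the whole References chain instead of only the immediate parent is what lets a thread survive a missing intermediate message.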

Frontend Considerations

Mailbox UI State Management

Normalized store for efficient updates:


interface MailboxState {
  // Normalized entities
  messages: Record<string, MessageSummary>
  threads: Record<string, Thread>
  labels: Record<string, Label>

  // View state
  currentLabelId: string
  messageOrder: string[] // Thread IDs in current view
  selectedThreadIds: Set<string>

  // Pagination
  nextPageToken: string | null
  isLoading: boolean

  // Optimistic updates
  pendingUpdates: Map<string, OptimisticUpdate>
}

// Update a single message without re-fetching list
function updateMessage(state: MailboxState, messageId: string, updates: Partial<MessageSummary>) {
  const message = state.messages[messageId]
  if (!message) return state

  return {
    ...state,
    messages: {
      ...state.messages,
      [messageId]: { ...message, ...updates },
    },
  }
}

Why normalized:

Marking read: Update 1 object, not scan array
Thread operations: Update thread aggregate, individual messages unchanged
Labels: Add/remove from set, no array reordering
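
A practical consequence of the normalized shape is that an update replaces exactly one entry and leaves every other object reference untouched, so memoized row components can skip re-rendering. The sketch below illustrates this with a deliberately reduced model; `Msg`, `ById`, and `markRead` are names assumed for the example.

```typescript
// Reduced message model for illustration (assumption: the real
// MessageSummary has more fields).
interface Msg {
  id: string
  isRead: boolean
}
type ById = Record<string, Msg>

// Immutable update of a single entry in an ID-keyed map.
function markRead(messages: ById, id: string): ById {
  const msg = messages[id]
  if (!msg || msg.isRead) return messages // No change: keep the same reference
  return { ...messages, [id]: { ...msg, isRead: true } }
}

const beforeState: ById = {
  m1: { id: "m1", isRead: false },
  m2: { id: "m2", isRead: false },
}
const afterState = markRead(beforeState, "m1")
// afterState["m1"] is a new object with isRead = true;
// afterState["m2"] is the very same object as beforeState["m2"].
```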

Virtualized Message List

For mailboxes with thousands of messages:


interface VirtualListConfig {
  containerHeight: number
  itemHeight: number // Estimated row height
  overscan: number // Extra rows above/below viewport
}

// Named VisibleRange to avoid clashing with the DOM's global Range type
interface VisibleRange {
  start: number
  end: number // Exclusive
}

class VirtualMailboxList {
  private readonly PAGE_SIZE = 50

  calculateVisibleRange(scrollTop: number, config: VirtualListConfig, itemCount: number): VisibleRange {
    const firstVisible = Math.floor(scrollTop / config.itemHeight)
    const visibleCount = Math.ceil(config.containerHeight / config.itemHeight)

    // Apply overscan once on each side and clamp to the list bounds
    const start = Math.max(0, firstVisible - config.overscan)
    const end = Math.min(itemCount, firstVisible + visibleCount + config.overscan)

    return { start, end }
  }

  // Fetch more when approaching end.
  // clientHeight is the scroll container's height, not the window's.
  async onScroll(scrollTop: number, scrollHeight: number, clientHeight: number): Promise<void> {
    const remainingScroll = scrollHeight - scrollTop - clientHeight

    if (remainingScroll < 500 && this.state.nextPageToken && !this.state.isLoading) {
      await this.fetchNextPage()
    }
  }
}
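
The visible range alone is not enough to render the list: the scroll container also needs spacer heights above and below the rendered rows so the scrollbar reflects the full mailbox. The following sketch works this out for fixed-height rows; `layoutWindow` and the `WindowLayout` shape are assumptions for illustration.

```typescript
interface WindowLayout {
  start: number // First rendered row (inclusive)
  end: number // Last rendered row (exclusive)
  paddingTop: number // Spacer height above the rendered rows, in px
  paddingBottom: number // Spacer height below the rendered rows, in px
}

function layoutWindow(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  overscan: number,
  itemCount: number,
): WindowLayout {
  const firstVisible = Math.floor(scrollTop / rowHeight)
  const visibleCount = Math.ceil(viewportHeight / rowHeight)
  const start = Math.max(0, firstVisible - overscan)
  const end = Math.min(itemCount, firstVisible + visibleCount + overscan)
  return {
    start,
    end,
    paddingTop: start * rowHeight,
    paddingBottom: (itemCount - end) * rowHeight,
  }
}

// 10,000 rows of 48px in a 600px viewport, scrolled to row 100
const windowLayout = layoutWindow(4800, 600, 48, 5, 10_000)
// start = 95, end = 118: 23 rows in the DOM instead of 10,000
// paddingTop = 4560, paddingBottom = 474336
```

The spacers plus the rendered rows always add up to `itemCount * rowHeight`, which keeps the scrollbar thumb stable while rows are swapped in and out.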

Compose Form with Autosave


interface DraftState {
  draftId: string | null
  to: EmailAddress[]
  cc: EmailAddress[]
  bcc: EmailAddress[]
  subject: string
  body: string
  attachments: AttachmentUpload[]
  replyToMessageId: string | null
  lastSavedAt: Date | null
  isDirty: boolean
}

class ComposeController {
  // ReturnType<typeof setTimeout> type-checks in both browser and Node typings
  private autosaveTimer: ReturnType<typeof setTimeout> | null = null
  private readonly AUTOSAVE_DELAY = 2000 // 2 seconds after last change

  onFieldChange(field: keyof DraftState, value: any): void {
    this.state = { ...this.state, [field]: value, isDirty: true }

    // Debounce autosave
    if (this.autosaveTimer) {
      clearTimeout(this.autosaveTimer)
    }

    this.autosaveTimer = setTimeout(() => this.saveDraft(), this.AUTOSAVE_DELAY)
  }

  async saveDraft(): Promise<void> {
    if (!this.state.isDirty) return

    // Snapshot the state being saved; onFieldChange replaces this.state,
    // so a different reference afterwards means edits arrived mid-request
    const snapshot = this.state

    const response = await this.api.saveDraft({
      draftId: snapshot.draftId,
      to: snapshot.to,
      subject: snapshot.subject,
      body: snapshot.body,
      // ...
    })

    this.state = {
      ...this.state,
      draftId: response.draftId,
      lastSavedAt: new Date(),
      isDirty: this.state !== snapshot, // Stay dirty if edited during the save
    }
  }

  async send(): Promise<void> {
    // Cancel the pending autosave so it cannot fire after the send
    if (this.autosaveTimer) {
      clearTimeout(this.autosaveTimer)
      this.autosaveTimer = null
    }

    // Optimistic: show "Sending..." immediately
    this.ui.showSendingIndicator()

    try {
      // Flush unsaved edits first so a failed send really leaves a saved draft
      await this.saveDraft()

      await this.api.sendMessage({
        draftId: this.state.draftId,
        to: this.state.to,
        // ...
      })

      // Success: close compose, show "Sent" with undo option
      this.ui.showSentWithUndo(5000) // 5 second undo window
      this.close()
    } catch (error) {
      this.ui.showError("Failed to send. Message saved as draft.")
    }
  }
}

Offline Support


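// The native IndexedDB API is event-based; awaiting an IDBRequest directly
// resolves immediately, so wrap requests in a Promise.
function requestToPromise<T>(request: IDBRequest<T>): Promise<T> {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result)
    request.onerror = () => reject(request.error)
  })
}

class OfflineMailbox {
  private db: IDBDatabase // IndexedDB for local storage

  async cacheMessages(messages: MessageSummary[]): Promise<void> {
    const tx = this.db.transaction("messages", "readwrite")
    const store = tx.objectStore("messages")
    for (const msg of messages) {
      store.put(msg) // Queue every write in the same transaction
    }
    // Resolve once the whole transaction has committed
    await new Promise<void>((resolve, reject) => {
      tx.oncomplete = () => resolve()
      tx.onerror = () => reject(tx.error)
      tx.onabort = () => reject(tx.error)
    })
  }

  async getMessagesOffline(labelId: string): Promise<MessageSummary[]> {
    const tx = this.db.transaction("messages", "readonly")
    // Assumes "by-label" was created as a multiEntry index over the label IDs
    const index = tx.objectStore("messages").index("by-label")
    return requestToPromise(index.getAll(labelId))
  }

  // Queue actions when offline
  async queueAction(action: OfflineAction): Promise<void> {
    const tx = this.db.transaction("pendingActions", "readwrite")
    await requestToPromise(
      tx.objectStore("pendingActions").add({
        id: crypto.randomUUID(),
        action,
        createdAt: new Date(),
      }),
    )
  }

  // Sync when back online
  async syncPendingActions(): Promise<void> {
    const readTx = this.db.transaction("pendingActions", "readonly")
    const actions = await requestToPromise(readTx.objectStore("pendingActions").getAll())

    // Keys are random UUIDs, so sort explicitly to replay in the order queued
    actions.sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime())

    for (const { id, action } of actions) {
      try {
        await this.executeAction(action)
        // Use a fresh transaction: an IndexedDB transaction auto-commits
        // while the network call above is awaited
        const deleteTx = this.db.transaction("pendingActions", "readwrite")
        await requestToPromise(deleteTx.objectStore("pendingActions").delete(id))
      } catch (error) {
        // Keep in queue for retry
        console.error("Sync failed:", action, error)
      }
    }
  }
}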

Infrastructure

Cloud-Agnostic Components

Component	Purpose	Options
MTA	Inbound/outbound SMTP	Postfix, Haraka, custom
Message queue	Delivery queue, async processing	Kafka, Pulsar, RabbitMQ
Message store	Email body and metadata	Cassandra, ScyllaDB, DynamoDB
Search	Full-text indexing	Elasticsearch, OpenSearch, Solr
Object store	Attachments	MinIO, Ceph, S3-compatible
Relational DB	User, label metadata	PostgreSQL, CockroachDB
Cache	Session, rate limiting	Redis, KeyDB, Dragonfly

AWS Reference Architecture

Service configurations:

Service	Configuration	Rationale
MX Pods (Fargate)	2 vCPU, 4GB, autoscale 10-100	SMTP is CPU-bound
Spam Filter	4 vCPU, 8GB, GPU optional	ML inference
API Gateway	2 vCPU, 4GB	Stateless REST/GraphQL
IMAP Pods	4 vCPU, 8GB	Connection state
Delivery Workers	2 vCPU, 4GB, Spot	Async, retry-tolerant
Keyspaces	On-demand	Managed Cassandra
OpenSearch	r6g.xlarge × 3	Search workload
ElastiCache Redis	r6g.large cluster	Session, rate limits
MSK	kafka.m5.large × 3	Message queue

Email-Specific Infrastructure

MX record configuration:

example.com.    IN MX   10 mx1.example.com.
example.com.    IN MX   10 mx2.example.com.
example.com.    IN MX   20 mx-backup.example.com.
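
With two records at the same preference (10), a sending server is expected to spread load across them and fall back to the priority-20 host only when both fail; RFC 5321 asks senders to randomize among equal-preference hosts. A sketch of that ordering is below. `orderMx` and `shuffle` are names assumed for this example, and the random source is injectable so the behavior can be tested deterministically.

```typescript
interface MxRecord {
  exchange: string
  priority: number
}

// Fisher-Yates shuffle on a copy of the input.
function shuffle<T>(items: T[], random: () => number): T[] {
  const out = [...items]
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1))
    const tmp = out[i] as T
    out[i] = out[j] as T
    out[j] = tmp
  }
  return out
}

// Lowest priority value first; hosts sharing a priority are shuffled.
function orderMx(records: MxRecord[], random: () => number = Math.random): string[] {
  const groups = new Map<number, MxRecord[]>()
  for (const record of records) {
    const group = groups.get(record.priority) ?? []
    group.push(record)
    groups.set(record.priority, group)
  }

  const ordered: string[] = []
  for (const priority of [...groups.keys()].sort((a, b) => a - b)) {
    for (const record of shuffle(groups.get(priority) ?? [], random)) {
      ordered.push(record.exchange)
    }
  }
  return ordered
}

const mxSample: MxRecord[] = [
  { exchange: "mx-backup.example.com", priority: 20 },
  { exchange: "mx1.example.com", priority: 10 },
  { exchange: "mx2.example.com", priority: 10 },
]
const deliveryOrder = orderMx(mxSample)
// mx1 and mx2 in random order, then mx-backup.example.com last
```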

DNS records for authentication:

; SPF
example.com.    IN TXT  "v=spf1 ip4:203.0.113.0/24 include:_spf.google.com -all"

; DKIM
selector1._domainkey.example.com.    IN TXT  "v=DKIM1; k=rsa; p=MIIBIjANBg..."

; DMARC
_dmarc.example.com.    IN TXT  "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"
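
The DKIM and DMARC records above share a semicolon-separated `tag=value` syntax, so a receiving server can read both with one small parser. The sketch below assumes well-formed input and uses an illustrative function name, `parseTagList`; SPF records are space-separated and are not covered.

```typescript
// Parse "tag=value; tag=value" TXT record content into a map.
function parseTagList(record: string): Map<string, string> {
  const tags = new Map<string, string>()
  for (const part of record.split(";")) {
    // Split on the first "=" only: values such as base64 keys may contain "="
    const eq = part.indexOf("=")
    if (eq === -1) continue // Skip empty segments, e.g. after a trailing ";"
    const tag = part.slice(0, eq).trim().toLowerCase()
    const value = part.slice(eq + 1).trim()
    tags.set(tag, value)
  }
  return tags
}

const dmarcTags = parseTagList("v=DMARC1; p=reject; rua=mailto:dmarc@example.com")
// dmarcTags.get("p") is "reject"
// dmarcTags.get("rua") is "mailto:dmarc@example.com"
```

A real verifier would also validate the `v=` tag and apply defaults for missing tags; those steps are omitted here.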

Scaling Considerations

Inbound throughput:

Single MTA pod: ~10K messages/minute (connection limited)
700K messages/second peak → 4,200 MX pods minimum
With headroom (2x): 8,400 MX pods, rounded up to ~10,000

Outbound throughput:

Per destination rate limits (Gmail: 500/day per IP for new IPs)
IP reputation warmup: Start 50/day, increase 2x daily
Dedicated IPs per sending reputation tier
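
The warmup bullet implies a geometric ramp. As a rough sketch of the arithmetic only (the function name `warmupSchedule` and the 100,000/day target are assumptions, and real providers may tolerate a faster or slower ramp), the daily caps can be generated like this:

```typescript
// Daily sending caps for a new IP: start low, multiply each day, stop at target.
function warmupSchedule(startPerDay: number, targetPerDay: number, growth = 2): number[] {
  if (startPerDay <= 0 || growth <= 1) {
    throw new Error("startPerDay must be positive and growth must exceed 1")
  }
  const caps: number[] = []
  let cap = startPerDay
  while (cap < targetPerDay) {
    caps.push(cap)
    cap *= growth
  }
  caps.push(targetPerDay) // Final day is capped at the target volume
  return caps
}

const warmupPlan = warmupSchedule(50, 100_000)
// 50, 100, 200, ..., 51200, 100000: the target is reached on day 12
```

Doubling from 50/day reaches six-figure daily volume in under two weeks, which is why the warmup period is short compared with the lifetime of a sending IP.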

Search index lag:

Target: < 30 seconds from receive to searchable
Indexer throughput: ~5K documents/second per node
700K/second peak → 140 indexer pods

Storage growth:

3.5PB/day raw (messages + attachments)
With compression (3:1): ~1.2PB/day
15-year retention: ~6.5EB
Tiered storage: Hot (SSD, 30 days) → Warm (HDD, 1 year) → Cold (S3 Glacier)
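
The sizing figures in this section are simple arithmetic on the stated inputs. The sketch below recomputes them so the assumptions can be changed in one place; the constant names are illustrative, and the values are the ones quoted above.

```typescript
// Inputs quoted in this section
const PEAK_MSGS_PER_SEC = 700_000
const MTA_MSGS_PER_MIN = 10_000 // Per MX pod
const INDEXER_DOCS_PER_SEC = 5_000 // Per indexer node
const RAW_PB_PER_DAY = 3.5
const COMPRESSION_RATIO = 3
const RETENTION_YEARS = 15

// Inbound: convert peak to messages/minute, divide by per-pod capacity
const mxPods = Math.ceil((PEAK_MSGS_PER_SEC * 60) / MTA_MSGS_PER_MIN) // 4200
const mxPodsWithHeadroom = mxPods * 2 // 8400, i.e. on the order of 10,000

// Search: one indexed document per message at peak
const indexerPods = Math.ceil(PEAK_MSGS_PER_SEC / INDEXER_DOCS_PER_SEC) // 140

// Storage: compressed daily volume, then total over the retention period
const compressedPbPerDay = RAW_PB_PER_DAY / COMPRESSION_RATIO // about 1.17
const retainedEb = (compressedPbPerDay * 365 * RETENTION_YEARS) / 1000 // about 6.4
```

The unrounded retention total is about 6.4EB; the ~6.5EB quoted above comes from using the rounded 1.2PB/day figure.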

Conclusion

This design provides a scalable email system with:

Reliable delivery via store-and-forward queuing with exponential backoff retries
Strong authentication through SPF + DKIM + DMARC defense in depth
Effective spam filtering using ML classification with heuristic rules
Fast retrieval via time-series optimized storage and full-text search indexing
Conversation threading using RFC 5322 headers with subject/participant fallback

Key architectural decisions:

Separate inbound/outbound paths allow independent scaling and different reliability requirements
Cassandra for messages provides time-series access patterns and horizontal scaling
Elasticsearch enables sub-second full-text search across billions of messages
Kafka queues decouple receipt from processing, enabling async spam filtering and indexing

Known limitations:

Search index lag (up to 30 seconds) means very recent messages may not appear in search
Spam model requires continuous training on user feedback to adapt to new attacks
Threading heuristics can fail for long-running threads with subject changes
Large attachments (>25MB) require chunked upload/download handling

Future enhancements:

AI-powered smart compose and reply suggestions
Proactive phishing detection using link analysis
Federated identity for cross-organization encryption
Real-time collaborative inbox for team email

Appendix

Prerequisites

SMTP protocol fundamentals (commands, response codes, envelope vs. headers)
DNS record types (MX, TXT, CNAME)
Distributed systems concepts (eventual consistency, partitioning)
Full-text search fundamentals (inverted indexes, tokenization)

Terminology

Term	Definition
MTA	Mail Transfer Agent; server that routes email between domains (Postfix, Sendmail)
MUA	Mail User Agent; email client (Outlook, Thunderbird, web interface)
MX record	DNS record specifying mail servers for a domain
Envelope	SMTP-level sender/recipient (MAIL FROM, RCPT TO); distinct from message headers
SPF	Sender Policy Framework; DNS-based authorization of sending IPs
DKIM	DomainKeys Identified Mail; cryptographic message signing
DMARC	Domain-based Message Authentication, Reporting, and Conformance; policy layer
Bounce	Non-delivery report (NDR); message informing sender of delivery failure
Backscatter	Bounces sent to forged sender addresses; a form of spam

Summary

Email systems separate inbound (MX servers, spam filtering, storage) from outbound (submission, DKIM signing, delivery queue) flows
Authentication trifecta (SPF + DKIM + DMARC) prevents spoofing: SPF checks sending IP, DKIM verifies content integrity, DMARC enforces policy
Naive Bayes achieves 99%+ spam detection by computing P(spam|tokens) with incremental training from user feedback
Cassandra provides time-series optimized message storage with partition-per-mailbox for efficient folder queries
Elasticsearch enables sub-500ms full-text search across years of messages with field-specific filtering
Threading uses RFC 5322 References/In-Reply-To headers with subject and participant matching as fallback

References

Protocol Specifications:

RFC 5321 - Simple Mail Transfer Protocol - SMTP specification
RFC 3501 - IMAP4rev1 - IMAP specification
RFC 5322 - Internet Message Format - Email message format
RFC 2045-2049 - MIME - Multipurpose Internet Mail Extensions

Authentication Standards:

Spam Filtering:

A Plan for Spam - Paul Graham - Foundational Bayesian spam filtering paper
Machine Learning for Email Spam Filtering (PMC) - ML approaches survey

Industry Implementations: