ADR-0018: Shared Attachments Table — Staged Upload, Polymorphic Link, and Soft Delete

Status

Accepted

Tags

fieldforce, s3, seaweedfs, upload, attachments, minio-go, shared-infrastructure

Decision

File attachments across all modules are stored in a single shared attachments table. The client generates a UUID for each file before upload. The upload endpoint (POST /api/organizations/:org_id/attachments) accepts multiple files in one multipart request, streams each to SeaweedFS via minio-go, and returns records immediately with status = 'pending' — no entity link is required at upload time. When a form is submitted (e.g., Log Activity), the request includes attachment_ids. Go atomically links them to the created entity via a single UPDATE ... WHERE status = 'pending' AND deleted_at IS NULL RETURNING id. The returned IDs are validated against the submitted IDs — a mismatch means one or more IDs were invalid or already used. Orphaned pending attachments are soft-deleted (via deleted_at) by a cleanup job after 24 hours. S3 objects are deleted before the row is soft-deleted. No hard deletes from the attachments table.

Why

A module-scoped attachment table (e.g., ff_attachments) requires duplication when other modules need the same capability. A shared table lets future modules (leave, support tickets, etc.) attach files without new migrations. Staged upload (pending → linked) decouples file selection from form submission — users can upload a photo before filling in the form, and the UX is non-blocking. Activity or task records are never created as stubs just to obtain an ID for the upload. Client-generated UUIDs serve as idempotency keys — if the client retries an upload (network drop, browser refresh), INSERT ... ON CONFLICT (id) DO NOTHING returns the existing row without re-uploading to S3. The polymorphic link (entity_type + entity_id) on the attachments side avoids junction tables per module and keeps attachment queries to a single indexed scan. No DB FK constraint exists on this link, but the atomic UPDATE ... RETURNING pattern enforces at application layer that only pending attachments from the correct org can be linked — the pre-check and the write happen in the same statement with no race window. Soft delete preserves an audit trail of all uploads. Hard deletes complicate S3 reconciliation and remove forensic history. Rejected alternatives:
  • Module-scoped table (ff_attachments): Requires a new table per module. No shared query pattern. Replaced by shared attachments.
  • Junction table per module: Adds joins; FK integrity is a benefit, but the atomic UPDATE ... RETURNING pattern achieves equivalent safety at application layer without the join cost.
  • Client uploads to S3 directly (presigned URL or frontend S3 credentials): Requires S3 credentials in the frontend environment. Cycle Planner’s signS3Request is not in a shared package. Rejected — Go owns all S3 I/O.
  • Hard delete for cleanup: Removes audit trail, complicates S3 reconciliation. Rejected.

How it works

Schema:
CREATE TABLE attachments (
  id          UUID PRIMARY KEY,                    -- client-generated
  org_id      TEXT NOT NULL,
  file_url    TEXT NOT NULL,                       -- /api/media/proxy?key=...
  s3_key      TEXT NOT NULL,                       -- internal; used by cleanup to delete from S3
  file_type   TEXT NOT NULL,                       -- 'photo' | 'signature' | extensible
  status      TEXT NOT NULL DEFAULT 'pending',     -- 'pending' | 'linked'
  entity_type TEXT,                                -- NULL until linked; e.g. 'ff_activity'
  entity_id   UUID,                                -- NULL until linked
  uploaded_by TEXT NOT NULL,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  deleted_at  TIMESTAMPTZ                          -- NULL = active; set = soft-deleted
);

CREATE INDEX ON attachments (entity_type, entity_id) WHERE deleted_at IS NULL;
CREATE INDEX ON attachments (status, created_at)     WHERE deleted_at IS NULL;
Upload flow (multiple files, idempotent):
POST /api/organizations/:org_id/attachments
Content-Type: multipart/form-data
Fields: files[] (binary), ids[] (client UUID per file), file_types[] (one per file)

For each (id, file, file_type):
  INSERT INTO attachments (id, org_id, file_url, s3_key, file_type, uploaded_by)
  VALUES (?, ?, ?, ?, ?, ?)
  ON CONFLICT (id) DO NOTHING RETURNING *
  → conflict: return existing row — skip S3 upload (idempotent retry)
  → new: stream file → SeaweedFS via minio-go PutObject, insert row status='pending'

Returns: [{ id, file_url, file_type, status }, ...]
Link flow (on form submit):
POST /api/organizations/:org_id/fieldforce/tasks/:task_id/activities
Body: { type, note, attachment_ids: ["uuid1", "uuid2"] }

Go:
  1. Create ff_activities row
  2. UPDATE attachments
       SET entity_type = 'ff_activity', entity_id = <new_activity_id>, status = 'linked'
     WHERE id = ANY(attachment_ids)
       AND status = 'pending'
       AND org_id = <org_id>
       AND deleted_at IS NULL
     RETURNING id
  3. len(returned) ≠ len(attachment_ids) → 422 "one or more attachment IDs are invalid or already used"
Cleanup flow (third pass in overdue job ticker):
SELECT id, s3_key FROM attachments
WHERE status = 'pending'
  AND created_at < now() - interval '24 hours'
  AND deleted_at IS NULL

For each row:
  minioClient.RemoveObject(ctx, bucket, row.S3Key)  -- delete from SeaweedFS
  if S3 error: log, continue (still soft-delete DB row)
  UPDATE attachments SET deleted_at = now() WHERE id = ?
S3 key format: fieldforce/{org_id}/{uuid}.{ext} — no activity scope in the key since the activity doesn’t exist at upload time.

Known limitations

  • No DB FK on entity_id: Application layer is the only enforcement. Mitigated by the atomic UPDATE ... RETURNING pattern — a bug that bypasses the WHERE clause could link to an invalid entity.
  • Soft-deleted row accumulation: Rows with deleted_at IS NOT NULL stay in the table indefinitely. Add a hard purge of rows where deleted_at < now() - interval '90 days' in a future maintenance job.
  • S3 delete failure during cleanup: If RemoveObject fails, the DB row is still soft-deleted and the S3 object lingers. Log the error; reconcile via periodic S3 inventory comparison if needed.
  • Cleanup job scope: The cleanup pass currently runs inside the fieldforce overdue job. When other modules adopt attachments, move it to a shared infrastructure job.
  • Online-only uploads: Phase 2 uploads require connectivity. Offline queuing is Phase 4 (IndexedDB via FieldforceStorageAdapter).

Rules for agents

  • NEVER hard delete from attachments — always set deleted_at = now()
  • Upload MUST use the client-generated UUID as the primary key — never generate the ID server-side
  • Upload MUST use INSERT ... ON CONFLICT (id) DO NOTHING — never upsert or update existing rows on conflict
  • Linking MUST use a single atomic UPDATE ... WHERE status = 'pending' AND deleted_at IS NULL RETURNING id — never a separate SELECT then UPDATE
  • s3_key MUST NOT appear in any API response — it is an internal column for cleanup only
  • Upload endpoint is at org level (/api/organizations/:org_id/attachments) — never under a module-specific path
  • Cleanup MUST attempt S3 deletion before setting deleted_at; if S3 fails, still soft-delete the DB row and log the error
  • S3 key format for fieldforce is fieldforce/{org_id}/{uuid}.{ext} — no activity_id in the key

Bad pattern (do not generate)

// Server-generated ID — breaks client idempotency
attachment.ID = uuid.New() // wrong — use the client-provided ID from the request

// Separate SELECT + UPDATE — race condition between check and write
attachments, _ := repo.FindByIDs(ctx, ids)
for _, a := range attachments {
    if a.Status != "pending" { return error } // window here — another request can link between check and write
}
repo.LinkToActivity(ctx, activityID, ids) // wrong

// Hard delete
db.Delete(&attachment) // wrong — always soft delete

// s3_key in API response
type AttachmentResponse struct {
    S3Key string `json:"s3_key"` // wrong — internal only
}

// Upload endpoint under a module path
router.POST("/fieldforce/activities/:id/attachments", ...) // wrong — shared endpoint at org level

Good pattern

// Idempotent upload — client provides the UUID
for i, fh := range fileHeaders {
    s3Key := fmt.Sprintf("fieldforce/%s/%s%s", orgID, req.IDs[i], filepath.Ext(fh.Filename))
    f, _ := fh.Open(); defer f.Close()
    h.minioClient.PutObject(ctx, cfg.Bucket, s3Key, f, fh.Size,
        minio.PutObjectOptions{ContentType: fh.Header.Get("Content-Type")})
    proxyURL := fmt.Sprintf("/api/media/proxy?key=%s", url.QueryEscape(s3Key))
    db.Exec(`
        INSERT INTO attachments (id, org_id, file_url, s3_key, file_type, status, uploaded_by)
        VALUES (?, ?, ?, ?, ?, 'pending', ?)
        ON CONFLICT (id) DO NOTHING
    `, req.IDs[i], orgID, proxyURL, s3Key, req.FileTypes[i], uploaderID)
}

// Atomic link on form submit
var linkedIDs []string
db.Raw(`
    UPDATE attachments
       SET entity_type = ?, entity_id = ?, status = 'linked'
     WHERE id = ANY(?) AND status = 'pending' AND org_id = ? AND deleted_at IS NULL
     RETURNING id
`, "ff_activity", activityID, pq.Array(attachmentIDs), orgID).Scan(&linkedIDs)
if len(linkedIDs) != len(attachmentIDs) {
    return echo.NewHTTPError(422, "one or more attachment IDs are invalid or already used")
}

// Soft delete in cleanup job
for _, row := range orphans {
    if err := h.minioClient.RemoveObject(ctx, cfg.Bucket, row.S3Key, minio.RemoveObjectOptions{}); err != nil {
        h.logger.Error("s3 cleanup failed", zap.String("id", row.ID), zap.Error(err))
    }
    db.Exec(`UPDATE attachments SET deleted_at = now() WHERE id = ?`, row.ID)
}
// Frontend — generate UUID before upload, send with file
const id = crypto.randomUUID()
const form = new FormData()
form.append('ids[]', id)
form.append('files[]', file)
form.append('file_types[]', 'photo')
const [attachment] = await api.post(`/attachments`, form)
// later, on form submit:
await api.post(`/fieldforce/tasks/${taskId}/activities`, { note, attachment_ids: [id] })