Data handling

This page is for procurement, security teams, and anyone who needs to answer “what does Marriska actually do with our data?” before signing off on the platform.

For the auth/access side specifically (sessions, API keys, role checks), see Security model.

  • Test definitions: the natural-language description you wrote, the parsed steps, any variables, the target URL.
  • Test sets: groupings of tests, names, ordering.
  • Schedules: cron expression, timezone, browser selection, notification recipients.
  • Tags, projects, folders: organizational metadata.
  • Step results: pass/fail, duration in milliseconds, error message (when failed).
  • Step screenshots: stored as LargeBinary blobs in Postgres (column screenshot_data on test_run_results). One per step per browser.
  • Visual baselines: when you opt into visual regression, the baseline screenshot is stored against the step ID + browser.
  • Logs: per-step structured log entries (step_log) with the action, target, URL at the time of the step.
  • Users: email, display name, password hash (bcrypt) for email/password accounts; provider-issued IDs for OAuth (Google, GitHub).
  • Organizations: name, plan tier, max member count.
  • Memberships: who’s in which org and at what role (owner / admin / member / viewer).
  • API keys: stored as SHA-256 hashes, never the raw key. The prefix (ak_live_••••••••...abcd) is kept for display.
  • Sessions: tokens are hashed; we store the device label and the last-used time.
  • Database: PostgreSQL 16. Test definitions, runs, screenshots, baselines, users, sessions — all in one Postgres cluster.
  • Screenshots and baselines specifically: in Postgres LargeBinary columns, not on a separate object store. (For the curious: this is intentional — backup, retention, and access control all collapse to one system.)
  • In-flight execution state: in process memory on the executor, cleaned up when the run finishes.
  • Self-hosted option: all of the data above can live on your own infrastructure if you prefer (STORAGE_MODE=database against your own Postgres). The self-hosted option is included in the Enterprise tier.
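
To make the API-key entry in the list above concrete, here is a minimal sketch of hash-only storage with a display prefix. The function names and the random key body are hypothetical; only the SHA-256 hashing and the ak_live_••••••••...abcd display shape come from this page.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> str:
    # Hypothetical key format: the "ak_live_" prefix plus random hex.
    return "ak_live_" + secrets.token_hex(16)


def hash_api_key(raw_key: str) -> str:
    # Only this hex digest would be persisted, never the raw key.
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def display_prefix(raw_key: str) -> str:
    # Keep the first 8 characters and the last 4 for display purposes.
    return f"{raw_key[:8]}••••••••...{raw_key[-4:]}"


def verify_api_key(presented: str, stored_hash: str) -> bool:
    # Hash the presented key and compare in constant time.
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

With this shape, a database leak exposes digests and display prefixes but not usable keys.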

There’s an InMemory storage mode (STORAGE_MODE=memory) used for development and ephemeral testing — runs vanish on backend restart. Production uses STORAGE_MODE=database.
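
As an illustration of the two storage modes, the sketch below validates STORAGE_MODE at startup. The function name and the choice to reject an unset or unknown value are assumptions, not documented Marriska behavior; only the variable name and its two values come from this page.

```python
import os

# The two values named on this page.
VALID_STORAGE_MODES = {"memory", "database"}


def resolve_storage_mode(env=None) -> str:
    # Hypothetical helper: read STORAGE_MODE and fail fast on bad input.
    env = os.environ if env is None else env
    mode = env.get("STORAGE_MODE", "").strip().lower()
    if mode not in VALID_STORAGE_MODES:
        raise ValueError(
            f"STORAGE_MODE must be one of {sorted(VALID_STORAGE_MODES)}, got {mode!r}"
        )
    return mode
```

Failing fast here is one way to avoid a production deployment silently falling back to the ephemeral mode.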

Each AI task ships only what it needs to do its job:

| Task | Sent to provider |
| --- | --- |
| Translation (non-English description → English) | The natural-language text you wrote |
| Parsing (English → structured steps) | The English text + a static system prompt |
| Visual comparison | Two PNG screenshots (baseline + current), encoded as base64 |
| Generation (when you ask for a generated test) | The prompt you provided |

Nothing else from the run leaves the platform on these calls. Step results, screenshots from non-visual steps, user identity, org name — none of that goes to the AI provider.
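
For the visual-comparison row, a rough sketch of what such a payload could look like follows. The function and the field names "baseline" and "current" are hypothetical; the page only states that two PNG screenshots are sent as base64.

```python
import base64


def build_visual_comparison_payload(baseline_png: bytes, current_png: bytes) -> dict:
    # Only the two images are included: no step results, user identity, or org name.
    return {
        "baseline": base64.b64encode(baseline_png).decode("ascii"),
        "current": base64.b64encode(current_png).decode("ascii"),
    }
```

Building the payload from an explicit allow-list of fields makes it easy to audit what leaves the platform on each call.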

When BYOK is configured (today: env-var-level; per-user-key UI in progress — see BYOK), these calls go to your provider account instead of ours. That puts the inference logs in your provider’s dashboard rather than ours.

We log:

  • Auth failures and successes (without the password or token).
  • API key usage — the key’s hashed prefix and last-used timestamp, not the raw value.
  • Per-request metadata — endpoint, status code, duration.
  • Sentry errors if SENTRY_DSN is configured (production).

We don’t log:

  • Passwords. Ever.
  • Raw API keys. Tokens are masked after the first 8 characters in any log line that touches them.
  • Test page contents (HTML/DOM dumps). The browser screenshot and the step result are persisted; the page source isn’t.
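
The masking rule for tokens can be sketched as below. The helper name, the fixed-width mask, and the handling of very short tokens are assumptions; only the "first 8 characters stay visible" rule comes from the list above.

```python
def mask_token(token: str, visible: int = 8) -> str:
    # Fixed-width mask so the log line does not reveal the token's length.
    mask = "*" * 8
    if len(token) <= visible:
        # Assumption: tokens too short to mask safely are hidden entirely.
        return mask
    return token[:visible] + mask
```

For example, `mask_token("ak_live_0123456789abcdef")` yields `ak_live_********`.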

Run history is retained per-tier:

  • Free: 14 days
  • Starter: 90 days
  • Pro: 1 year
  • Team and Enterprise: unlimited

The history_retention_days value is currently enforced at view time and at downgrade time: when you downgrade, the impact preview shows which history falls outside the new window. Programmatic background pruning is on the roadmap.
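
A view-time retention check along these lines might look like the following sketch. The mapping keys, the function names, and the reading of "1 year" as 365 days are assumptions; the tier values otherwise mirror the list above, with None standing for unlimited.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# history_retention_days per tier; None means unlimited.
HISTORY_RETENTION_DAYS = {
    "free": 14,
    "starter": 90,
    "pro": 365,  # assumption: "1 year" treated as 365 days
    "team": None,
    "enterprise": None,
}


def retention_cutoff(tier: str, now: Optional[datetime] = None) -> Optional[datetime]:
    # Oldest timestamp still inside the window, or None for unlimited tiers.
    days = HISTORY_RETENTION_DAYS[tier]
    if days is None:
        return None
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=days)


def is_run_visible(finished_at: datetime, tier: str, now: Optional[datetime] = None) -> bool:
    # Applied when history is read; nothing is deleted by this check.
    cutoff = retention_cutoff(tier, now)
    return cutoff is None or finished_at >= cutoff
```

Because the check filters at read time, older runs stay in the database until background pruning exists.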

Deleting a test removes the definition and its variable data. Run history for past executions is retained per the retention policy above.

There’s no in-app “delete my account” button today. Email support@marriska.com and include the account email; we delete the user record, sessions, and (on request) the org’s data within the SLA your tier defines.

  • Active sessions show under Settings → Security → Active Sessions. Revoke individual sessions or sign out everywhere.
  • Email verification links expire after 15 minutes (EMAIL_VERIFICATION_EXPIRE_MINUTES).
  • Invite links expire after 7 days.
  • Within an org: members see what their role allows. Viewers can read; members can edit and run; admins/owner can manage settings, members, and API keys.
  • Across orgs: every database read filters by the caller’s org_id. Cross-org reads return 403. The middleware enforces this on every protected route.
  • Reports: today, every report URL requires the recipient to be a member of the same org. Public share tokens aren’t shipped — see Sharing reports.
  • In transit: HTTPS by default for the production API (https://api.marriska.com); the security headers middleware adds X-Content-Type-Options: nosniff, X-Frame-Options: DENY, X-XSS-Protection: 1; mode=block, Referrer-Policy: strict-origin-when-cross-origin, and a no-store cache directive on every /api/ response.
  • At rest: Postgres-level encryption depends on your hosting provider (Railway / your own infra). API keys are SHA-256 hashed before storage; passwords are bcrypt-hashed.
  • Per-user secrets (when per-user BYOK ships): will be encrypted in the database. The store isn’t shipped yet — see BYOK.
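
The in-transit headers listed above can be illustrated with a framework-free sketch. The function name is hypothetical, and applying the four security headers to every response while limiting no-store to /api/ paths is one reading of the description, not confirmed behavior.

```python
# Header values as listed on this page.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def apply_security_headers(path: str, headers: dict) -> dict:
    # Return a copy of the response headers with the security headers added.
    out = dict(headers)
    out.update(SECURITY_HEADERS)
    if path.startswith("/api/"):
        # Assumption: the no-store directive is sent as Cache-Control.
        out["Cache-Control"] = "no-store"
    return out
```

In a real deployment this logic would sit in the web framework's response middleware rather than a standalone function.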

We don’t:

  • Sell or share data with third parties. Period.
  • Train models on your tests. AI provider calls go through their standard inference APIs — not their training pipelines. (For provider-side guarantees on this, refer to your chosen provider’s data-handling docs — OpenAI, Anthropic, etc., all publish API data-use policies.)
  • Store payment-card data. Once the real Stripe integration ships, card data will live in Stripe; we will hold the customer ID, not the card.

See also:

  • Security model: auth, isolation, RBAC, the access-control side
  • BYOK — moving AI inference to your own provider account
  • Plan tier limits — history retention by tier