Data handling

This page is for procurement, security teams, and anyone who needs to answer “what does Marriska actually do with our data?” before signing off on the platform.

For the auth/access side specifically (sessions, API keys, role checks), see Security model.

What we store

About the test itself

Test definitions: the natural-language description you wrote, the parsed steps, any variables, the target URL.
Test sets: groupings of tests, names, ordering.
Schedules: cron expression, timezone, browser selection, notification recipients.
Tags, projects, folders: organizational metadata.

About each run

Step results: pass/fail, duration in milliseconds, error message (when failed).
Step screenshots: stored as LargeBinary blobs in Postgres (column screenshot_data on test_run_results). One per step per browser.
Visual baselines: when you opt into visual regression, the baseline screenshot is stored against the step ID + browser.
Logs: per-step structured log entries (step_log) with the action, target, URL at the time of the step.

About people and orgs

Users: email, display name, password hash (bcrypt) for email/password accounts; provider-issued IDs for OAuth (Google, GitHub).
Organizations: name, plan tier, max member count.
Memberships: who’s in which org and at what role (owner / admin / member / viewer).
API keys: stored as SHA-256 hashes, never the raw key. The prefix (ak_live_••••••••...abcd) is kept for display.
Sessions: tokens are hashed; we store the device label and the last-used time.
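
The hash-and-prefix scheme for API keys can be illustrated with a minimal sketch. This is not Marriska's implementation: the function names, the random key body, and the exact display format are assumptions made for illustration. Only the SHA-256 hashing and the ak_live_ prefix come from the description above.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> str:
    # Hypothetical key format: the documented "ak_live_" prefix plus random hex.
    return "ak_live_" + secrets.token_hex(16)


def hash_api_key(raw_key: str) -> str:
    # Only this digest would be persisted; the raw key is never stored.
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def display_prefix(raw_key: str) -> str:
    # Assumed display format: fixed prefix, a mask, and the last four characters.
    return "ak_live_••••••••..." + raw_key[-4:]


def verify_api_key(presented: str, stored_hash: str) -> bool:
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

With this shape, a database leak exposes digests and display prefixes but not usable keys, which is the property the list above describes.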

Where data lives

Database: PostgreSQL 16. Test definitions, runs, screenshots, baselines, users, sessions — all in one Postgres cluster.
Screenshots and baselines specifically: in Postgres LargeBinary columns, not on a separate object store. (For the curious: this is intentional — backup, retention, and access control all collapse to one system.)
In-flight execution state: in process memory on the executor, cleaned up when the run finishes.
Self-hosted option: everything above can live on your own infrastructure if you prefer (STORAGE_MODE=database against your own Postgres). The self-hosted option is included in the Enterprise tier.

There’s an InMemory storage mode (STORAGE_MODE=memory) used for development and ephemeral testing — runs vanish on backend restart. Production uses STORAGE_MODE=database.
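
As a rough sketch of how a backend might read this setting, the snippet below validates STORAGE_MODE at startup. The function names and the choice to reject a missing or unknown value are assumptions for illustration; only the variable name and its two values (database, memory) come from this page.

```python
import os

VALID_STORAGE_MODES = {"database", "memory"}


def resolve_storage_mode(env=None) -> str:
    # Read STORAGE_MODE from the given mapping, or from the process environment.
    env = os.environ if env is None else env
    mode = env.get("STORAGE_MODE", "").strip().lower()
    if mode not in VALID_STORAGE_MODES:
        # Assumption: fail fast instead of silently picking a default.
        raise ValueError(f"STORAGE_MODE must be one of {sorted(VALID_STORAGE_MODES)}")
    return mode


def is_durable(mode: str) -> bool:
    # Runs survive a backend restart only in database mode.
    return mode == "database"
```

Failing fast on an unset value is one way to avoid a production deployment accidentally running in memory mode and losing runs on restart.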

What gets sent to AI providers

Each AI task ships only what it needs to do its job:

Task	Sent to provider
Translation (non-English description → English)	The natural-language text you wrote
Parsing (English → structured steps)	The English text + a static system prompt
Visual comparison	Two PNG screenshots (baseline + current), encoded as base64
Generation (when you ask for a generated test)	The prompt you provided

Nothing else from the run leaves the platform on these calls. Step results, screenshots from non-visual steps, user identity, org name — none of that goes to the AI provider.
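
To make the visual-comparison row concrete, here is a minimal sketch of a payload that carries only the two screenshots. The function name and the field names (baseline, current) are hypothetical; the actual request shape depends on the provider and is not documented on this page.

```python
import base64


def build_visual_comparison_payload(baseline_png: bytes, current_png: bytes) -> dict:
    # Hypothetical field names. The payload holds the two images and nothing else:
    # no step results, user identity, or org name.
    return {
        "baseline": base64.b64encode(baseline_png).decode("ascii"),
        "current": base64.b64encode(current_png).decode("ascii"),
    }
```

Keeping the payload builder this narrow makes it easy to audit what leaves the platform: anything not passed into the function cannot end up in the request.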

When BYOK is configured (today: env-var-level; per-user-key UI in progress — see BYOK), these calls go to your provider account instead of ours. That puts the inference logs in your provider’s dashboard rather than ours.

What gets logged

We log:

Auth failures and successes (without the password or token).
API key usage — the key’s hashed prefix and last-used timestamp, not the raw value.
Per-request metadata — endpoint, status code, duration.
Sentry errors if SENTRY_DSN is configured (production).

We don’t log:

Passwords. Ever.
Raw API keys. Tokens are masked after the first 8 characters in any log line that touches them.
Test page contents (HTML/DOM dumps). The browser screenshot and the step result are persisted; the page source isn’t.
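
The masking rule for tokens can be sketched as below. This is an illustration, not the platform's logging code: the function name and the handling of very short values are assumptions. Only the rule of masking after the first 8 characters comes from this page.

```python
def mask_token(token: str, visible: int = 8) -> str:
    # Keep the first `visible` characters and replace the rest with asterisks.
    if len(token) <= visible:
        # Assumption: values this short are masked entirely rather than shown.
        return "*" * len(token)
    return token[:visible] + "*" * (len(token) - visible)
```

For a key that starts with ak_live_, the visible part is exactly that 8-character prefix, so a log line identifies the kind of credential without revealing any of its secret portion.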

Retention and deletion

History retention by tier

How long run history is kept depends on your tier:

Free: 14 days
Starter: 90 days
Pro: 1 year
Team and Enterprise: unlimited

Today the history_retention_days value is enforced at two points: when you view run history, and when you downgrade, where the impact preview shows how much history the lower tier drops. Older runs are not yet pruned by a background job; programmatic pruning is on the roadmap.
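
A view-time retention check along these lines could look like the sketch below. The mapping, the function names, and the reading of "1 year" as 365 days are assumptions for illustration; the tier values themselves come from the list above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# None means unlimited. Assumption: "1 year" is treated as 365 days.
HISTORY_RETENTION_DAYS = {
    "free": 14,
    "starter": 90,
    "pro": 365,
    "team": None,
    "enterprise": None,
}


def retention_cutoff(tier: str, now: datetime) -> Optional[datetime]:
    # Oldest timestamp still inside the tier's window, or None for unlimited.
    days = HISTORY_RETENTION_DAYS[tier]
    return None if days is None else now - timedelta(days=days)


def is_run_visible(finished_at: datetime, tier: str, now: datetime) -> bool:
    # A run is shown if the tier is unlimited or the run is newer than the cutoff.
    cutoff = retention_cutoff(tier, now)
    return cutoff is None or finished_at >= cutoff
```

Because the check filters at read time, a downgrade only changes which runs are visible; it does not by itself delete rows, which matches the note that background pruning is not shipped yet.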

Deleting a test or test set

Deleting a test removes the definition and its variable data. Run history for past executions is retained per the retention policy above.

Deleting an account

There’s no in-app “delete my account” button today. Email support@marriska.com and include the account email; we delete the user record, sessions, and (on request) the org’s data within the SLA your tier defines.

Sessions and tokens

Active sessions are listed under Settings → Security → Active Sessions. From there you can revoke individual sessions or sign out everywhere.
Email verification links expire after 15 minutes (EMAIL_VERIFICATION_EXPIRE_MINUTES).
Invite links expire after 7 days.
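
The two link lifetimes above can be expressed as a small expiry check. This is a sketch under stated assumptions: the helper names are hypothetical, the default of 15 is taken from this page, and treating a link as expired exactly at the boundary is a guess.

```python
import os
from datetime import datetime, timedelta, timezone


def email_verification_ttl(env=None) -> timedelta:
    # Reads EMAIL_VERIFICATION_EXPIRE_MINUTES, falling back to the documented 15.
    env = os.environ if env is None else env
    return timedelta(minutes=int(env.get("EMAIL_VERIFICATION_EXPIRE_MINUTES", "15")))


# Hypothetical constant name; the 7-day lifetime comes from this page.
INVITE_TTL = timedelta(days=7)


def is_expired(issued_at: datetime, ttl: timedelta, now: datetime) -> bool:
    # Assumption: a link counts as expired once its full lifetime has elapsed.
    return now - issued_at >= ttl
```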

Who can see what

Within an org: members see what their role allows. Viewers can read; members can edit and run; admins and owners can manage settings, members, and API keys.
Across orgs: every database read filters by the caller’s org_id. Cross-org reads return 403. The middleware enforces this on every protected route.
Reports: today, every report URL requires the recipient to be a member of the same org. Public share tokens aren’t shipped — see Sharing reports.
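
The org and role rules above can be sketched as a single authorization check. The names here (Caller, ROLE_PERMISSIONS, authorize) and the action labels are hypothetical, and returning 403 for an insufficient role is an assumption; this page only states that cross-org reads return 403.

```python
from dataclasses import dataclass

# Hypothetical action names derived from the role descriptions on this page.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "member": {"read", "edit", "run"},
    "admin": {"read", "edit", "run", "manage"},
    "owner": {"read", "edit", "run", "manage"},
}


@dataclass(frozen=True)
class Caller:
    org_id: str
    role: str


def authorize(caller: Caller, resource_org_id: str, action: str) -> int:
    # Cross-org access is rejected before any role check.
    if caller.org_id != resource_org_id:
        return 403
    # Assumption: an insufficient or unknown role also yields 403.
    if action not in ROLE_PERMISSIONS.get(caller.role, set()):
        return 403
    return 200
```

Checking the org first means no role, however privileged, can reach another org's data.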

Encryption

In transit: HTTPS by default for the production API (https://api.marriska.com). The security headers middleware adds X-Content-Type-Options: nosniff, X-Frame-Options: DENY, X-XSS-Protection: 1; mode=block, and Referrer-Policy: strict-origin-when-cross-origin, plus a no-store cache directive on every /api/ response.
At rest: Postgres-level encryption depends on your hosting provider (Railway / your own infra). API keys are SHA-256 hashed before storage; passwords are bcrypt-hashed.
Per-user secrets (when per-user BYOK ships): will be encrypted in the database. The store isn’t shipped yet — see BYOK.
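
For the in-transit item, a security headers step might look like the framework-agnostic sketch below. The function name is hypothetical, and it assumes the four fixed headers apply to every response while the no-store directive applies only under /api/; the header names and values are the ones listed above.

```python
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def apply_security_headers(path: str, headers: dict) -> dict:
    # Return a copy of the response headers with the security headers added.
    out = dict(headers)
    out.update(SECURITY_HEADERS)
    if path.startswith("/api/"):
        # Assumption: the no-store directive is scoped to /api/ responses only.
        out["Cache-Control"] = "no-store"
    return out
```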

What we don’t do

Sell data, or share it with third parties beyond the services named on this page (for example, the AI provider calls described above).
Train models on your tests. AI provider calls go through their standard inference APIs — not their training pipelines. (For provider-side guarantees on this, refer to your chosen provider’s data-handling docs — OpenAI, Anthropic, etc., all publish API data-use policies.)
Store payment-card data. When the live Stripe integration ships, card data will live in Stripe; we will hold the customer ID, not the card.

Security model — auth, isolation, RBAC, the access-control side
BYOK — moving AI inference to your own provider account
Plan tier limits — history retention by tier