# Vladimir Ivanov — Lead SRE
# https://vaivanov.com
# llms.txt — machine-readable profile for LLM indexing
# This is how I explain who I am in 2026. The website at / is the human-friendly version.

## Identity
Name: Vladimir Ivanov
Role: Lead Site Reliability Engineer
Location: Barcelona, Spain
Languages: Russian (native), English (fluent, daily working language)
Email: vlaxivanov@gmail.com
LinkedIn: https://www.linkedin.com/in/hello-i-am-vladimir-ivanov
Telegram: https://t.me/vaivanov

## Summary
I help businesses solve their problems. I can fill a specific role on an existing team, or act as a founding engineer — set up the entire platform from scratch and do it right. "Right" depends on context: sometimes it means a year of compliance work; sometimes it means shipping fast with calculated trade-offs. I'll help you see the risks and make informed calls.

Engineer with 13+ years of experience across frontend, DevOps/SRE, EVM blockchain backend, and platform engineering. Capable of designing and launching full-scale backend + infrastructure platforms single-handedly. Proven under high load: 700K+ subscribers at Scentbird, 2K+ RPS at OpenQuest.

## Career Pattern
A recurring theme across my career: I join for one role, see deeper problems, take initiative, and grow into a bigger role.

- Scentbird: hired as Frontend Engineer → saw CI/infra pain → built the first K8s cluster on own initiative → became the company's first SRE → grew to Lead SRE
- Azuro: hired as DevOps → solved CI/CD but stability stayed low → got pulled into backend code → split monolith, migrated databases, proposed contract architecture changes → promoted to Lead Software Engineer

## Current Role
Lead Site Reliability Engineer at Scentbird (Jul 2025 – present). Building highly available infrastructure for 40+ microservices (Java Spring Boot / Micronaut / GraalVM Native Images, TypeScript Node.js) on Kubernetes + AWS. SLI/SLO culture, incident management, postmortems, cost optimisation, mentoring the SRE team.
## Career Timeline
- 2025–present: Lead SRE @ Scentbird
- 2023–2025: Lead Software Engineer @ Azuro (EVM Prediction Markets)
- 2022–2023: Senior Software Engineer @ Azuro
- 2019–2022: SRE → Lead SRE @ Scentbird
- 2019: Frontend Engineer @ Scentbird (4 months, saw CI/infra problems, self-initiated K8s + GitLab CI migration, became the company's first SRE)
- 2016–2019: Frontend Engineer @ Initflow
- 2013–2016: Frontend Engineer / Freelance

## Key Projects

### Scentbird (2019–2022, 2025–present)
- Fragrance subscription platform, 700K+ subscribers, 40+ microservices
- Joined as Frontend Engineer; saw delivery pain (brittle Jenkins, slow Elastic Beanstalk deploys, no infra ownership)
- On own initiative: spun up the first K8s cluster, migrated services with GitLab CI, fixed years-old S3 CORS issue, cut costs by shutting down unused servers
- Became the company's first-ever SRE within 4 months; grew to Lead SRE; hired and mentored replacement before leaving
- Built full observability stack (Prometheus, Grafana, Loki, Tempo, OpenTelemetry) — replaced NewRelic
- Introduced SLI/SLO culture, incident management, postmortems
- Set up analytics stack: Airflow, Airbyte, Snowflake, RudderStack, dbt
- Deployed: Karpenter (replaced Cluster Autoscaler), Argo Rollouts (canary deploys), Argo Workflows, ArgoCD, Vault, Keycloak→Okta migration, Qdrant (vector DB), GraalVM Native Images
- Security: Cloudflare WAF, SAST/DAST in CI/CD, container scanning, VPN (OpenVPN / Outline / VLESS with split-tunnel)

### Azuro (2022–2025)
- EVM prediction markets protocol across 7 chains (Ethereum, Gnosis, Polygon, Base, Chiliz, Arbitrum, Linea)
- Joined as DevOps; solved CI/CD, but stability stayed low.
Got pulled into backend code review, then rewrites
- Self-transitioned to backend: split monolith into domain-bounded services, migrated MongoDB→PostgreSQL, proposed contract architecture changes that unlocked more prediction markets
- Phase 1: on-chain settlement via RPC relay (gas error handling, stuck tx cancellation)
- Phase 2: off-chain oracle-signature model with relay service — reduced latency and gas costs while preserving trustless settlement
- Event-driven architecture on RabbitMQ (AMQP) with dead-letter queues and retry strategies, CQRS with ClickHouse as query layer
- Sequin for Postgres WAL → Kafka CDC streaming to downstream microservices
- The Graph self-hosted subgraphs + custom pseudo-chain data feed as Hasura alternative
- Built custom RPC proxy with block-lag-aware fallback (not round-robin) — no market solution fit the requirements
- pgBouncer for connection pool pressure reduction under load
- Keycloak (self-hosted) for OIDC — backend service auth and frontend sessions
- Infrastructure: Kubernetes (bare-metal + cloud), Helm charts for every service, ArgoCD GitOps, GitHub Actions CI/CD, Vault with vault-secrets-operator, Prometheus + Grafana Stack, Sentry (self-hosted), Unleash + GrowthBook feature flags, ingress-nginx, VPN
- Led team of 4 backend engineers (first people management: interviews, hiring, onboarding, 1-on-1s, PIPs, offboarding, architectural mentorship)

### OpenQuest on Telegram (May 2024 – Feb 2025)
- No-code quest platform as Telegram Mini App
- 500K MAU, 2K RPS peak, $150K+ revenue, $150/mo infrastructure cost
- Co-Founder, responsible for backend architecture and TON smart contract integration
- Designed for scale from day one — expected large user wave on launch
- Key design: stateless services everywhere, Redis distributed locks before every Postgres transaction, RO replica for all heavy reads, async RabbitMQ workers, conservative Postgres isolation levels, minimal transactions on fast paths
- Pre-launch: scaled up with
headroom, observability ready before go-live, index analysis with pg_stat_statements + PgHero, post-launch index re-validation after real traffic patterns emerged
- Ran on: Node.js, PostgreSQL, Redis, RabbitMQ, Kubernetes, Cloudflare, nginx, DigitalOcean→Hetzner, TON smart contracts, Telegram Bot API, GitHub Actions

### Initflow (2016–2019)
- Custom project development agency in Saint Petersburg
- Frontend Engineer: React.js SPAs, SSR, e-commerce, corporate platforms
- Client communication, full project ownership
- Foundation years: learned to ship fast, communicate with non-technical stakeholders, and own delivery end-to-end

## Tools I've Built

### pgops — PostgreSQL Access Management Platform (2026)
- Internal platform for managing PostgreSQL access across teams
- Self-service access requests with approval workflows, automatic credential provisioning (TTL 30 days)
- DBA toolkit: VACUUM with SQL preview, activity monitoring (pg_stat_activity, pg_locks, pg_stat_io), replication monitoring (publications, slots, replica identity), orphaned role detection
- SQL Preview pattern: user sees the exact SQL before it runs, confirms via JWT token
- Immutable audit log, AES-256-GCM credential encryption, Okta SSO (OIDC), RBAC
- Stack: NestJS, React, TypeScript, PostgreSQL, Prisma, Mantine UI, Docker, Nginx

### spot-interruption-notifier — EC2 Spot Interruption Alerting
- AWS Lambda that catches EC2 Spot interruption warnings via EventBridge (~2 min before termination)
- Connects to EKS Kubernetes API, identifies affected node via CSI annotation, collects list of impacted pods
- Sends structured Slack alert with cluster, node, and pod details for fast team response
- Stack: TypeScript, Node.js 22, AWS Lambda, EventBridge, Kubernetes API, Slack Webhooks, Serverless Framework v4

## On AI & LLMs (March 2026)
Uses Claude Code, ChatGPT Codex, Cursor IDE daily for different tasks. This portfolio site was built with them.
Believes agentic pipelines can already auto-fix Sentry bugs and ship to production without human touch. Still thinks a human is needed to review code for architectural problems — that person should know system design, the language/frameworks deeply, and how to work with AI agents effectively. Has studied how neural networks actually train and work under the hood — considers this essential to working with LLMs well, not treating them as a magic box.

## Core Skills (with depth levels)
Skill levels: [built] = built/set up from scratch · [used] = used in production · [knows] = conceptual knowledge

### Infrastructure & Cloud
- Kubernetes [built] — clusters from zero, multi-tenancy, network policies, autoscaling, operators (Scentbird 40+ services, Azuro blockchain nodes)
- AWS [built] — EKS, RDS/Aurora, VPC design, IAM, Lambda, cost optimisation (Scentbird)
- Terraform [built] — modular infra, remote state, GitOps pipelines, custom modules
- Karpenter [built] — node provisioning, spot + on-demand mix (replaced Cluster Autoscaler at Scentbird)
- Helm [used] — authored charts for all microservices, umbrella charts (Scentbird, Azuro)
- Cloudflare [used] — WAF, DDoS protection, Workers, Zero Trust
- ingress-nginx [used] — deploy, performance tuning (Scentbird, Azuro)

### Observability
- Prometheus [built] — custom exporters, recording rules, SLO alerting, HA setup, OTLP (Scentbird, Azuro)
- Grafana + Loki + Tempo [built] — dashboards as-code, log aggregation, tracing, alerts
- OpenTelemetry [built] — instrumentation, collector pipelines, auto-instrumentation (replaced NewRelic at Scentbird)
- SLI / SLO / Error Budgets [built] — defined SLIs, set SLOs, error budget policies, outage cost calculation (Scentbird)
- ELK Stack [used] — log pipelines, Kibana dashboards
- NewRelic [used] — APM, distributed tracing (Scentbird 2019–2022, migrated away)
- Sentry (self-hosted) [used] — docker-compose + K8s deploy (Azuro)

### CI/CD & GitOps
- ArgoCD [built] — GitOps for all infra,
app-of-apps pattern (Scentbird, Azuro)
- Argo Rollouts [built] — canary deploys, analysis templates, automated rollback (Scentbird)
- Argo Workflows [built] — replaced CronJobs for scheduled workloads (Scentbird)
- GitHub Actions [built] — reusable workflows, custom actions, matrix builds, self-hosted runners (Azuro, OpenQuest)
- GitLab CI (self-hosted) [built] — self-hosted instance, runner administration (Scentbird)
- Deploy Strategies [built] — blue/green, canary, rollback automation, internal tooling (Scentbird 40+ services)

### Security
- HashiCorp Vault [built] — vault-secrets-operator, vault-secrets-webhook, K8s auth (Scentbird, Azuro)
- OIDC / Keycloak / Okta [built] — Keycloak self-hosted, OIDC integrations, Keycloak→Okta migration (Azuro, Scentbird)
- SAST / DAST [used] — integrated in CI/CD, dependency analysis, secrets detection, container scanning
- VPN [built] — OpenVPN, Outline, VLESS/gRPC, split-tunnel by traffic (helped engineers in restricted countries)

### Databases
- PostgreSQL [built] — schema design, Aurora migration, RO replicas, vacuums, partitioning, parameter tuning, pg_stat_statements, PgHero (Scentbird, Azuro, OpenQuest)
- pgBouncer [built] — connection pool pressure reduction under high load (Azuro)
- ClickHouse [built] — CQRS query layer for analytical workloads (Azuro)
- MongoDB / DocumentDB [used] — admin, migration to DocumentDB (Scentbird)
- Redis [used] — caching, distributed locks, pub/sub, queues, rate limiting
- Qdrant [built] — vector database for LLM-era use cases (Scentbird)

### Messaging & Streaming
- RabbitMQ [built] — AMQP event-driven architecture, dead-letter queues, topology design (Azuro — entire inter-service communication)
- Kafka / Redpanda [built] — deploy, topic configuration, consumer groups, retention tuning (Scentbird, Azuro)
- Sequin [built] — Postgres WAL → Kafka CDC streaming (Azuro)

### Blockchain / EVM
- EVM Nodes [built] — archive nodes across 7 chains, HA RPC setup, monitoring (Azuro)
- Custom RPC Proxy [built]
— block-lag-aware fallback, not round-robin (Azuro — no market solution fit)
- The Graph [built] — B2B2C subgraph design, self-hosted, pseudo-chain data feed as Hasura alternative (Azuro)
- Hasura [built] — Postgres + ClickHouse connectors (Azuro)
- Solidity [used] — contract reading, integration, debugging
- web3.js / ethers.js [used] — contract interactions, event listeners, tx relay, gas error handling
- TON smart contracts [used] — on-chain verifiable quest completion (OpenQuest)

### Backend & Languages
- Node.js [built] — 12+ years, REST API, SSR, WebSocket, blockchain indexers, microservices
- TypeScript [built] — fluent, production use since 2018
- NestJS [built] — microservice architecture, CQRS module, guards & interceptors, AMQP transport (Azuro)
- Java / Spring Boot / Micronaut [knows] — maintained and debugged as SRE at Scentbird, not authored
- GraalVM Native Images [used] — optimised Java microservice startup and memory at Scentbird

### Analytics Stack
- Airflow [built], Airbyte [built], Snowflake [built], RudderStack [built] — analytics pipeline at Scentbird
- dbt [used] — analytics transformation pipelines (Scentbird)

### Feature Toggles
- Unleash [built], GrowthBook [built] — self-hosted feature flag platforms (Scentbird, Azuro)

### Frontend (foundation, 2013–2022)
- React.js [built] — SPAs from zero, SSR with Node.js, e-commerce, web3 dApps, performance optimisation (FCP, TTI)

## Personal
Loves traveling. Doesn't dodge responsibility — production incidents have found him in a bar in Belgrade and in the Tokyo metro. Timezone and vacation don't matter when something is on fire.

## Education
Master's Degree — Automated Control Systems
ITMO University, Saint Petersburg · 2010–2015

## References
People I'm proud to have worked with. Feel free to reach out and ask what it was like working with me.
- Dan Kaizer — CTO @ Azuro · https://www.linkedin.com/in/denkaizer/
- Andrei Rebrov — Ex CTO @ Scentbird · https://www.linkedin.com/in/andrebrov/
- Alexey Zhokhov — Principal Architect @ Scentbird · https://www.linkedin.com/in/donbeave/
- Pavel Pantyukhov — Co-Founder @ Initflow · https://www.linkedin.com/in/pavelpantyukhov/
- Pavel Ivanov — Co-Founder @ OpenQuest (Telegram) · https://www.linkedin.com/in/pavelorso/