A case study on solving a real infrastructure problem fast — from zero to full analytical access in under 3 months.


The Problem

SpinCity Solutions is a B2C iGaming operator. When I joined as Data Architect, their BI Manager had a straightforward problem with no straightforward solution: the company had accumulated over 6 terabytes of historical data across their production databases, and nobody could query it.

The databases were operational systems — designed to serve the live platform, not to handle analytical workloads. Running a complex query across the full historical dataset would either time out, degrade platform performance, or both. The BI team had effectively lost access to their own history.

This meant no trend analysis. No year-over-year comparisons. No ability to answer questions like "how did player behaviour change after we launched this feature six months ago?" The BI Manager was making decisions based on whatever recent data they could extract without crashing the system.

What I Built

I designed and delivered an ETL pipeline that moved the full 6+ TB dataset from the production databases onto Amazon S3, structured for analytical access.

The approach:

  • Extraction: Incremental pulls from the source databases, scheduled to avoid peak operational hours. The critical constraint was that extraction couldn't impact the live platform's performance.
  • Transformation: Cleaning, deduplication, and restructuring the data into a format optimised for analytical queries rather than transactional operations
  • Storage: Landing the transformed data on S3 in a columnar format, partitioned by time periods to enable efficient range queries
  • Query layer: Setting up a query engine on top of S3 so the BI team could run SQL against the full dataset without any infrastructure management
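The extraction and landing steps above can be sketched in miniature. This is a hypothetical illustration, not the production pipeline: the `events` table, its column names, and the watermark value are invented, and the real system landed columnar files on S3 rather than JSON on local disk — but the watermark-based incremental pull and the Hive-style time partitioning are the same pattern.

```python
# Sketch of watermark-based incremental extraction and time-partitioned
# landing. All table/column names here are hypothetical examples.
import sqlite3, json, os, tempfile

def extract_increment(conn, watermark):
    """Pull only rows newer than the last watermark, so each run touches a
    small slice of the source instead of scanning the full history."""
    cur = conn.execute(
        "SELECT id, player_id, created_at FROM events "
        "WHERE created_at > ? ORDER BY created_at", (watermark,))
    return [dict(zip(("id", "player_id", "created_at"), r)) for r in cur]

def land_partitioned(rows, root):
    """Write rows under Hive-style date partitions (dt=YYYY-MM-DD) so a
    query engine can prune partitions on time-range predicates."""
    paths = set()
    for row in rows:
        part = os.path.join(root, f"dt={row['created_at'][:10]}")
        os.makedirs(part, exist_ok=True)
        with open(os.path.join(part, "part-0000.json"), "a") as f:
            f.write(json.dumps(row) + "\n")
        paths.add(part)
    return sorted(paths)

# Demo with an in-memory stand-in for the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INT, player_id INT, created_at TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (1, 10, "2024-06-01T10:00:00"),
    (2, 11, "2024-06-02T11:00:00"),
    (3, 12, "2024-06-02T12:00:00"),
])
rows = extract_increment(conn, "2024-06-01T23:59:59")  # only rows after day 1
parts = land_partitioned(rows, tempfile.mkdtemp())
```

The same partition layout is what makes the query layer cheap: a SQL engine over S3 only reads the `dt=` directories a query's time filter touches.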

The whole project was delivered in under 3 months — from initial assessment to the BI team running their first queries against the full historical dataset.

The Outcome

The BI team went from being unable to query historical data at all to having full analytical access across the entire 6+ TB dataset, with query response times under 20 minutes for complex cross-historical analyses.

For context: these were queries that simply could not run before. The BI Manager went from making educated guesses to making data-informed decisions with full historical context.

The solution was also designed with cost in mind. SpinCity is a smaller operator, not a company with an unlimited cloud budget. Using S3 as the storage layer with on-demand querying meant they only paid for what they used, rather than maintaining expensive always-on analytical infrastructure.
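To make the cost argument concrete, here is a back-of-envelope sketch. The per-terabyte-scanned price is illustrative of Athena-style on-demand pricing, and the partition count is a hypothetical example — neither figure is from the project itself.

```python
# Rough cost model for pay-per-scan querying over partitioned S3 data.
# PRICE_PER_TB is illustrative (Athena-style on-demand pricing is in this
# ballpark); the 36 monthly partitions are a hypothetical example.
PRICE_PER_TB = 5.00  # USD per TB scanned (illustrative)

def query_cost_usd(tb_scanned: float, price: float = PRICE_PER_TB) -> float:
    """Cost of a single query under pay-per-scan pricing."""
    return tb_scanned * price

DATASET_TB = 6.0
full_scan = query_cost_usd(DATASET_TB)        # no partition pruning
one_month = query_cost_usd(DATASET_TB / 36)   # prune to 1 of 36 monthly partitions
```

The point holds regardless of exact prices: with time-partitioned columnar files, a typical BI query scans a fraction of the 6 TB, and there is no idle warehouse cluster to pay for between queries.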

What I Learned

Speed matters more than perfection in startup environments. SpinCity needed this solved in months, not quarters. I evaluated open-source tooling deliberately, avoided over-engineering, and focused on getting the BI team productive as fast as possible. The "right" architectural decision was the one that delivered value quickly while leaving room to evolve.

The before/after story is the most powerful metric. "6TB migrated" is a number. "BI Manager went from unable to query to full historical access" is a story. When I describe this project, the transformation is what resonates — not the technical details of the ETL pipeline.

Vendor lock-in is a real concern for smaller operators. I recommended open-source tooling specifically to avoid creating a dependency on expensive proprietary platforms. For a company of SpinCity's size, the difference between a $500/month and a $5,000/month data platform matters. Making that recommendation — and being able to justify it technically — is part of the job.


I'm Julian Calleja, a Senior Data Engineer focused on real-time data platforms in iGaming. Currently building at Elantil, previously at GiG and SpinCity. Get in touch if you want to talk about data infrastructure or iGaming.

How I Unlocked 6TB of Historical Data for a BI Team That Was Flying Blind

What I've Built

Data Migration Framework

Gaming Innovation Group (GiG)

Designed and delivered a migration solution that enabled GiG to onboard large, established operators with existing customer bases. Migrated 20,000+ customer records in under 30 minutes.

20,000+ records · < 30 min migration
Referenced in GiG's Q1 2024 earnings call

Broker CRM — Real-Time Pipeline

Gaming Innovation Group (GiG)

Built a real-time CRM data pipeline delivering live player events to platforms like Symplify, enabling 10+ operator clients to trigger personalised actions based on real-time behaviour.

300k events/hour · 10+ operator clients

6TB Data Warehouse Migration

SpinCity Solutions

Took a BI team from being unable to query historical data to full analytical access across a 6+ TB dataset by designing an ETL pipeline onto S3, with complex queries running in under 20 minutes.

6+ TB migrated · < 20 min query time

Platform Performance Optimisation

Elantil

Achieved significant performance gains on the current data platform through pipeline backfill optimisations.

59.5% throughput increase · 10.4% faster execution

Where I've Worked

Elantil

Senior Data Engineer · 2024–Present

Building the data platform for an iGaming startup. Product launches, client demos, and go-to-market collaboration.

SpinCity Solutions

Data Architect · 2024

Designed and shipped a new data warehouse in 3 months for a B2C iGaming operator.

Gaming Innovation Group

Data Engineer → Senior · 2019–2024

4+ years building real-time data products used by 10+ operators across casino, sports, and lottery.

Bit8

QA Engineer · 2018–2019

First iGaming role. Built the test automation foundation for the platform.

Verticals I've Worked Across

Casino · Sports Betting · Lottery · Crypto Gambling · Affiliate Marketing · Player Acquisition · B2B Platform · B2C Operator

Want to talk data, iGaming, or building things?

Whether it's a potential collaboration, a technical challenge, or just a conversation.