A case study on solving a real infrastructure problem fast — from zero to full analytical access in under 3 months.
The Problem
SpinCity Solutions is a B2C iGaming operator. When I joined as Data Architect, their BI Manager had a straightforward problem with no straightforward solution: the company had accumulated over 6 terabytes of historical data across their production databases, and nobody could query it.
The databases were operational systems — designed to serve the live platform, not to handle analytical workloads. Running a complex query across the full historical dataset would either time out, degrade platform performance, or both. The BI team had effectively lost access to their own history.
This meant no trend analysis. No year-over-year comparisons. No ability to answer questions like "how did player behaviour change after we launched this feature six months ago?" The BI Manager was making decisions based on whatever recent data they could extract without crashing the system.
What I Built
I designed and delivered an ETL pipeline that moved the full 6+ TB dataset from the production databases onto Amazon S3, structured for analytical access.
The approach:
- Extraction: Incremental pulls from the source databases, scheduled to avoid peak operational hours. The critical constraint was that extraction couldn't impact the live platform's performance.
- Transformation: Cleaning, deduplication, and restructuring the data into a format optimised for analytical queries rather than transactional operations.
- Storage: Landing the transformed data on S3 in a columnar format, partitioned by time period to enable efficient range queries.
- Query layer: Setting up a query engine on top of S3 so the BI team could run SQL against the full dataset without managing any infrastructure.
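To make the extraction and storage steps concrete, here's a minimal sketch of the pattern. The table name, columns, and watermark logic are illustrative, not SpinCity's actual schema; a production version would write Parquet to S3 (e.g. via pyarrow) rather than returning rows, but the watermark and Hive-style partition-key ideas are the same:

```python
import sqlite3
from datetime import datetime

def extract_increment(conn, watermark):
    """Pull only rows created after the last successful extraction.

    Watermark-based incremental pulls keep each extraction small, so it
    can run inside an off-peak window without straining the source DB.
    """
    cur = conn.execute(
        "SELECT id, player_id, amount, created_at FROM bets "
        "WHERE created_at > ? ORDER BY created_at",
        (watermark,),
    )
    return cur.fetchall()

def partition_key(created_at):
    """Hive-style partition path (year=/month=) so the query engine can
    prune whole time ranges instead of scanning the full dataset."""
    ts = datetime.fromisoformat(created_at)
    return f"year={ts.year}/month={ts.month:02d}"

# Stand-in for a production source database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bets (id INTEGER, player_id INTEGER, "
    "amount REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO bets VALUES (?, ?, ?, ?)",
    [
        (1, 10, 5.0, "2024-01-15T10:00:00"),
        (2, 11, 7.5, "2024-02-02T23:30:00"),
        (3, 10, 2.0, "2024-02-20T04:10:00"),
    ],
)
# Watermark recorded by the previous run: only newer rows are pulled.
rows = extract_increment(conn, "2024-01-31T23:59:59")
keys = [partition_key(r[3]) for r in rows]
```

The partition key becomes part of the S3 object path, which is what makes "last 12 months" queries cheap: the engine only reads the partitions in range.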
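The post doesn't name the query engine, but Amazon Athena is a common choice for SQL-over-S3 with no infrastructure to manage, so here's a hedged sketch of what the query layer might look like. The function takes the Athena client as a parameter (a real boto3 client in production, a stub in tests); the database name and output location are placeholders:

```python
import time

def run_athena_query(client, sql, database, output_location, poll_seconds=2):
    """Submit a SQL query and block until it completes.

    `client` is expected to behave like a boto3 Athena client. Athena is
    asynchronous: you start an execution, poll its state, then fetch
    results once it has succeeded.
    """
    qid = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        state = client.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {qid} ended in state {state}")
    return client.get_query_results(QueryExecutionId=qid)
```

Pay-per-query pricing is what makes this fit the cost constraint discussed below: there is no always-on cluster, only a charge per terabyte scanned, which partitioning keeps low.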
The whole project was delivered in under 3 months — from initial assessment to the BI team running their first queries against the full historical dataset.
The Outcome
The BI team went from being unable to query historical data at all to having full analytical access across the entire 6+ TB dataset, with query response times under 20 minutes for complex cross-historical analyses.
For context: these were queries that literally could not run before. The BI Manager went from making educated guesses to making data-informed decisions with full historical context.
The solution was also designed with cost in mind. SpinCity is a smaller operator, not a company with an unlimited cloud budget. Using S3 as the storage layer with on-demand querying meant they only paid for what they used, rather than maintaining expensive always-on analytical infrastructure.
What I Learned
Speed matters more than perfection in startup environments. SpinCity needed this solved in months, not quarters. I deliberately chose open-source tooling, avoided over-engineering, and focused on getting the BI team productive as fast as possible. The "right" architectural decision was the one that delivered value quickly while leaving room to evolve.
The before/after story is the most powerful metric. "6TB migrated" is a number. "BI Manager went from unable to query to full historical access" is a story. When I describe this project, the transformation is what resonates — not the technical details of the ETL pipeline.
Vendor lock-in is a real concern for smaller operators. I recommended open-source tooling specifically to avoid creating a dependency on expensive proprietary platforms. For a company of SpinCity's size, the difference between a $500/month and a $5,000/month data platform matters. Making that recommendation — and being able to justify it technically — is part of the job.
I'm Julian Calleja, a Senior Data Engineer focused on real-time data platforms in iGaming. Currently building at Elantil, previously at GiG and SpinCity. Get in touch if you want to talk about data infrastructure or iGaming.