
Building a Scalable BigQuery Data Foundation with Google Cloud Platform and CoreSignal Enrichment for a Mid-Sized Asset Management Company

Twopir delivered a medallion-style BigQuery warehouse with automated biannual ingestion, data standardization, and rate-limited CoreSignal enrichment—creating trusted gold-layer tables for analytics, relationship insights, and strategic growth.

The Client:

A mid-sized asset management company in North America needed a reliable, scalable data foundation to empower distribution and marketing teams to identify investment opportunities, map consultant-client relationships, and support data-driven growth strategies.

Twopir implemented a harmonized BigQuery environment with biannual data ingestion from multiple sources, rigorous quality standardization, and CoreSignal-powered company enrichment, culminating in production-ready gold-layer dimensions and facts for analytics and downstream tools.

The Problem:

Data processes were manual, inconsistent, and time-intensive, limiting the ability to generate timely insights for distribution and marketing teams.

01. Inconsistent Biannual Data Ingestion

  • Data arrived from multiple sources in varying formats (CSV/XLSX), causing cycle-over-cycle schema drift and heavy manual cleanup.

02. Gaps in Key Entity Data

  • Company and relationship records had missing attributes and quality issues, hindering accurate segmentation and opportunity identification.

03. Scalable External Enrichment Challenges

  • Enrichment via external APIs (like CoreSignal) needed to handle large volumes while respecting rate limits and ensuring repeatability for future cycles.

Tool and Use Case:

Google Cloud Platform + BigQuery

BigQuery: Serverless, scalable analytics warehouse for harmonized data.

Google Cloud Storage (GCS): Secure landing zone for raw files.

Use Case:

  • Centralize multi-source raw data into a governed warehouse
  • Standardize schemas, clean entities, and enable repeatable processing
  • Support analytics and downstream integrations with trusted outputs

Figure: BigQuery Medallion Architecture Overview

BigQuery + CoreSignal Enrichment

CoreSignal: API for fresh company firmographic and profile enrichment.

BigQuery Notebooks + Cloud Functions/Tasks: Orchestration and rate-limit-safe execution.

Use Case:

  • Programmatically fill gaps in company attributes (size, industry, financials, etc.)
  • Scale enrichment across thousands of records while respecting API limits
  • Integrate enriched data into gold-layer dimensions for reliable analytics

Figure: CoreSignal Enrichment Flow

Our Approach:

Twopir followed a phased, structured methodology to build a future-proof data platform emphasizing reliability, automation, and team independence.

Phase 1: Diagnostic Audit

  • Analyzed historical datasets, ingestion patterns, and quality issues
  • Identified key entities (companies, relationships) and pain points
  • Defined roadmap for medallion architecture and enrichment

Phase 2: Implementation & Automation

  • Built GCS landing → BigQuery bronze/silver/gold pipelines
  • Developed SQL transformations and CoreSignal orchestration
  • Implemented error handling, retries, and rate-limit controls

Phase 3: Refinement & Enablement

  • Optimized queries, partitioning, and performance
  • Created runbooks, documentation, and training
  • Enabled internal team to run future cycles autonomously

The Solution:

Standardized Multi-Source Ingestion

Raw files landed in structured GCS folders with metadata, and external tables in BigQuery were used to validate them before bronze loading, ensuring auditability and traceability.
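
A pre-load validation step of this kind can be sketched in a few lines. The example below is only an illustration, not the checks Twopir ran: the expected column names are hypothetical (the case study does not list them), and it compares a CSV header against that list to surface schema drift before a file is loaded.

```python
import csv
import io

# Hypothetical expected schema for one source; the real column names would
# come from the client's source files, which the case study does not list.
EXPECTED_COLUMNS = ["company_name", "consultant_name", "aum_usd"]


def check_header(csv_text, expected=EXPECTED_COLUMNS):
    """Compare a file's header row with the expected columns.

    Names are compared case-insensitively after trimming whitespace.
    Returns a dict listing missing and unexpected columns.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    found = [name.strip().lower() for name in header]
    missing = [c for c in expected if c not in found]
    unexpected = [c for c in found if c not in expected]
    return {
        "missing": missing,
        "unexpected": unexpected,
        "ok": not missing and not unexpected,
    }


# A file whose third column was renamed between cycles.
sample = "Company_Name, Consultant_Name ,AUM\nAcme Capital,Jane Doe,1200000\n"
report = check_header(sample)
```

Here `report` flags `aum_usd` as missing and `aum` as unexpected, so the file could be held back for review instead of being loaded into bronze.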

Medallion Architecture in BigQuery

  • Bronze: Raw/minimally processed data
  • Silver: Cleaned, standardized, deduplicated entities
  • Gold: Curated, enriched dimensions & facts for analytics

Automated Transformations

Parameterized BigQuery SQL scripts handled standardization, deduplication (via window functions), and relationship mapping—reusable across cycles.
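
As a rough illustration of window-function deduplication, the sketch below keeps the most recent row per company key using `ROW_NUMBER()`. It runs against an in-memory SQLite database purely so the query can be tried locally (this needs SQLite 3.25 or later); the table name, column names, and the "latest row wins" rule are assumptions, not the project's actual logic.

```python
import sqlite3

# sqlite3 is a local stand-in so the query can run anywhere; the table and
# column names below are hypothetical.
DEDUP_SQL = """
SELECT company_key, company_name, updated_at
FROM (
    SELECT
        company_key,
        company_name,
        updated_at,
        ROW_NUMBER() OVER (
            PARTITION BY company_key
            ORDER BY updated_at DESC
        ) AS row_rank
    FROM silver_company_candidates
) AS ranked
WHERE row_rank = 1
ORDER BY company_key
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE silver_company_candidates "
    "(company_key TEXT, company_name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO silver_company_candidates VALUES (?, ?, ?)",
    [
        ("acme capital", "Acme Capital LLC", "2023-06-30"),
        ("acme capital", "ACME Capital", "2023-12-31"),
        ("birch advisors", "Birch Advisors", "2023-12-31"),
    ],
)
# One row per company_key survives: the one with the latest updated_at.
deduped = conn.execute(DEDUP_SQL).fetchall()
conn.close()
```

The same `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)` pattern is available in BigQuery SQL, where the outer filter can also be written with a `QUALIFY` clause.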

Scalable External Enrichment

Missing company attributes were enriched via the CoreSignal API; batching and throttling kept requests within rate limits.
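
A minimal sketch of batching with throttling, under stated assumptions: `fetch` is a hypothetical callable standing in for the enrichment request (no real CoreSignal endpoint or client is shown), and the batch size and minimum interval are illustrative values, not the vendor's documented limits.

```python
import time


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def enrich_throttled(company_ids, fetch, batch_size=50, min_interval_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call fetch(batch) per batch, spacing calls at least min_interval_s apart.

    fetch is a hypothetical stand-in for the enrichment request: it takes a
    list of IDs and returns a dict keyed by ID.
    """
    results = {}
    last_call = None
    for batch in batched(company_ids, batch_size):
        if last_call is not None:
            # Sleep only for the part of the interval not already elapsed.
            wait = min_interval_s - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.update(fetch(batch))
    return results


def fake_fetch(batch):
    # Stand-in for a real API call; returns placeholder attributes.
    return {company_id: {"industry": "unknown"} for company_id in batch}


enriched = enrich_throttled(list(range(120)), fake_fetch,
                            batch_size=50, min_interval_s=0.0)
```

Passing `clock` and `sleep` as parameters keeps the pacing logic testable without real waiting, which helps when the same routine has to be rerun each cycle.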

BigQuery and CoreSignal Integration – Implementation Details

1. Data Landing (GCS → BigQuery)

  • Biannual files uploaded to partitioned GCS folders
  • External tables for schema validation
  • Batch loads into bronze tables

2. Transformation Logic (BigQuery SQL)

  • Deterministic matching and standardization
  • Deduplication with ranking/window functions
  • Relationship mapping across sources
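
Deterministic matching usually depends on a normalized key that is identical for spelling variants of the same entity. The sketch below shows the idea in Python for readability; the suffix list and cleaning rules are assumptions, and in the project this logic lived in BigQuery SQL rather than in Python.

```python
import re

# Hypothetical list of legal suffixes to drop when building a match key.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "lp", "corp", "co"}


def match_key(name):
    """Lower-case, replace punctuation with spaces, drop trailing legal suffixes.

    Only ASCII letters and digits are kept, so accented names would need
    extra handling in a real pipeline.
    """
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    tokens = cleaned.split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


names = ["Acme Capital, LLC", "ACME  Capital", "Acme Capital Inc."]
# All three variants collapse to the same key.
keys = {match_key(n) for n in names}
```

Because the key is a pure function of the input name, the same file produces the same matches on every run, which is what makes the step repeatable across cycles.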

3. CoreSignal Enrichment Orchestration

  • BigQuery Notebooks for logic development
  • Batch processing with rate-limit throttling
  • Cloud Functions + Cloud Tasks for retry-safe API calls
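
The retry-safe part of the orchestration can be pictured as exponential backoff around each request. This is a simplified in-process sketch, not the deployed Cloud Functions code: `RateLimited` and `flaky_request` are hypothetical stand-ins, and a Cloud Tasks queue can also apply its own retry settings outside the function.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the API signals a rate limit."""


def call_with_retries(request, max_attempts=4, base_delay_s=1.0,
                      sleep=time.sleep):
    """Run request(), retrying on RateLimited with exponential backoff.

    Delays are base_delay_s, then 2x, 4x, and so on. The final failure is
    re-raised so the caller can log it or requeue the task.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay_s * (2 ** attempt))


attempts = []
delays = []


def flaky_request():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimited()
    return {"status": "ok"}


# delays.append replaces time.sleep so the example runs instantly.
result = call_with_retries(flaky_request, sleep=delays.append)
```

For retries to be safe, the enrichment write should be idempotent, for example keyed by company ID, so a repeated call overwrites a row and does not duplicate it.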

4. Gold Layer Outputs

  • dim_company: Enriched, deduplicated master with firmographics
  • dim_relationship: Consultant-client mappings
  • fact_distribution_activity: Engagement and activity metrics

Figure: Gold Layer Schema & Outputs

Impact and Outcomes:

The platform shifted the company from manual, error-prone processes to automated, scalable data operations—unlocking faster insights and strategic agility.

50–70% Reduction in Manual Data Preparation

Automation eliminated the repetitive cleanup previously required in each biannual cycle.

20–40% Improvement in Data Completeness

CoreSignal enrichment filled key firmographic gaps reliably.

Cycle Processing Time Reduced from Days to Hours

Repeatable pipelines and orchestration accelerated delivery dramatically.

Overall: Distribution and marketing teams now access trusted, enriched data for better opportunity identification and relationship visibility—reducing dependency on manual work and enabling scalable growth.

Relatable? We should definitely talk.

What we’ll cover when we speak:

  • Identify revenue bottlenecks & growth opportunities in your current Salesforce and HubSpot setup
  • Improve lead qualification & pipeline conversion using automation and AI
  • Optimize CRM, Marketing, and Sales workflows for better visibility and performance
  • ERP, Order Management & backend system integrations
  • AI-driven process optimization (lead scoring, routing, forecasting, agent-based automation)
  • Custom training & enablement for Sales, Marketing, and Revenue Operations teams
  • Discuss your key pain points, upcoming projects, or ongoing support needs