---
title: "Alphanumeric CNPJ Migration - 100 Million Records"
slug: "cnpj-migration-database"
summary: "Execution of massive CNPJ migration from numeric to alphanumeric in database with ~100M records, using phased commit strategy to avoid database locks."
client: "Collection Agency"
industry: "Collections & Financial Services"
timeline: "In execution"
role: "Database Architect & Tech Lead"
image: ""
tags:
  - SQL Server
  - Database Migration
  - CNPJ
  - Performance Optimization
  - Batch Processing
  - Big Data
featured: true
order: 4
date: 2024-11-01
seo_title: "Alphanumeric CNPJ Migration - 100M Records | Carneiro Tech"
seo_description: "Case study of massive CNPJ migration in database with 100 million records using phased commits and performance optimizations."
seo_keywords: "database migration, SQL Server, CNPJ, batch processing, performance optimization, phased commits"
---

## Overview

A collection agency that runs databases of transitory data (and develops no proprietary software) needs to adapt its systems to the new Brazilian **alphanumeric CNPJ** format.

**Main challenge:** Convert the CNPJ columns of tables holding ~**100 million records** from `BIGINT` and `NUMERIC` to `VARCHAR` without locking the production database.

**Status:** Project in execution (migration script preparation).


---

## Challenge

### Massive Data Volume

**Company context:**
- Collection agency (does not develop proprietary software)
- Works with **transitory data** (high turnover)
- SQL Server database with critical volume

**Initial analysis revealed:**

| Table | Column | Current Type | Records | Size |
|--------|--------|------------|-----------|---------|
| Debtors | CNPJ_Debtor | BIGINT | 8,000,000 | 60 GB |
| Transactions | CNPJ_Payer | NUMERIC(14) | 90,000,000 | 1.2 TB |
| Companies | CNPJ_Company | BIGINT | 2,500,000 | 18 GB |
| **TOTAL** | - | - | **~100,000,000** | **~1.3 TB** |

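An inventory like the one in the table could be gathered from the SQL Server catalog views. The following is a minimal sketch, assuming the CNPJ columns can be recognized by having `CNPJ` in their name (an assumption for illustration, not something the project states). Row counts come from partition metadata and are approximate.

```sql
-- Sketch: list numeric columns whose name suggests they store a CNPJ.
-- Assumption: every CNPJ column has 'CNPJ' somewhere in its name.
SELECT
    s.name        AS schema_name,
    t.name        AS table_name,
    c.name        AS column_name,
    ty.name       AS data_type,
    c.[precision] AS numeric_precision,
    SUM(p.rows)   AS approx_rows       -- summed over heap / clustered index partitions
FROM sys.columns    AS c
JOIN sys.tables     AS t  ON t.object_id     = c.object_id
JOIN sys.schemas    AS s  ON s.schema_id     = t.schema_id
JOIN sys.types      AS ty ON ty.user_type_id = c.user_type_id
JOIN sys.partitions AS p  ON p.object_id     = t.object_id
                         AND p.index_id IN (0, 1)
WHERE c.name LIKE '%CNPJ%'
  AND ty.name IN ('bigint', 'int', 'numeric', 'decimal')
GROUP BY s.name, t.name, c.name, ty.name, c.[precision]
ORDER BY approx_rows DESC;
```

Columns that store a CNPJ under a different name would be missed by the name filter, so the result is a starting point for the schema analysis rather than a complete list.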

**Identified problems:**

1. **Tables with 8M+ rows** using `BIGINT` for CNPJ
2. **90 million records** in the transactions table
3. **CNPJ as primary key** in some tables
4. **Foreign keys** relating multiple tables
5. **No room for extended downtime** (24/7 operation)
6. **Disk space restrictions** (require a space-efficient strategy)


---

## Strategic Decision: Phased Commits

### Why NOT do ALTER COLUMN directly?

**Naive approach (DOESN'T work):**

```sql
-- NEVER DO THIS ON LARGE TABLES
ALTER TABLE Transactions
ALTER COLUMN CNPJ_Payer VARCHAR(18);
```

**Problems:**
- Locks the entire table for the duration of the conversion
- Can take hours or days on large tables
- Blocks all operations (INSERT, UPDATE, SELECT)
- Risk of timeout or failure mid-operation
- Complex rollback if something goes wrong


---

### Chosen Strategy: Column Swap with Phased Commits

**Based on previous experience**, I decided to use a gradual approach:

```
┌─────────────────────────────────────────────┐
│ 1. Create new VARCHAR column at END         │
│    (fast operation, doesn't lock table)     │
└─────────────────────────────────────────────┘
                       ▼
┌─────────────────────────────────────────────┐
│ 2. UPDATE in batches (phased commits)      │
│    - 100k records at a time                 │
│    - Pause between batches (avoid lock)     │
└─────────────────────────────────────────────┘
                       ▼
┌─────────────────────────────────────────────┐
│ 3. Remove PKs and FKs                       │
│    (after 100% migrated)                    │
└─────────────────────────────────────────────┘
                       ▼
┌─────────────────────────────────────────────┐
│ 4. Rename columns (swap)                    │
│    - CNPJ → CNPJ_Old                        │
│    - CNPJ_New → CNPJ                        │
└─────────────────────────────────────────────┘
                       ▼
┌─────────────────────────────────────────────┐
│ 5. Recreate PKs/FKs with new column         │
└─────────────────────────────────────────────┘
                       ▼
┌─────────────────────────────────────────────┐
│ 6. Validation and old column deletion      │
└─────────────────────────────────────────────┘
```


**Why this approach?**

- **No complete table lock** (incremental operation)
- **Can pause/resume** at any time
- **Real-time progress monitoring**
- **Simple rollback** (until the swap, just drop the new column)
- **Minimizes production impact** (small commits)

**Decision based on:**
- Previous experience with large volume migrations
- Knowledge of SQL Server locks
- Need for zero downtime

**Note:** This decision was made **without consulting AI** - based purely on practical experience from previous projects.


---

## Implementation Details

### Phase 1: Create New Column

```sql
-- Fast operation (metadata change only for a nullable column without a default)
ALTER TABLE Transactions
ADD CNPJ_Payer_New VARCHAR(18) NULL;

-- Temporary filtered index so each batch can find unmigrated rows quickly.
-- Unlike the ADD above, building it reads the whole table: every row matches
-- the filter at this point, and the index shrinks as batches are migrated.
CREATE NONCLUSTERED INDEX IX_Temp_CNPJ_New
ON Transactions(CNPJ_Payer_New)
WHERE CNPJ_Payer_New IS NULL;
```

**Estimated time:** ~1 second for the `ADD` (independent of table size); the index build scales with table size.

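Until the swap in Phase 4 the original column is untouched, so rolling back amounts to removing what Phase 1 added. A minimal sketch, assuming the object names used above and SQL Server 2016 or later for the `IF EXISTS` syntax:

```sql
-- Rollback sketch (only valid before the Phase 4 rename).
-- The filtered index depends on the new column, so it has to be dropped first.
DROP INDEX IF EXISTS IX_Temp_CNPJ_New ON Transactions;

ALTER TABLE Transactions
    DROP COLUMN IF EXISTS CNPJ_Payer_New;
```

After the rename, a rollback would instead mean swapping the column names back, so this sketch no longer applies at that point.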

---

### Phase 2: Batch Migration (Core Strategy)

```sql
-- Migration script with phased commits
DECLARE @BatchSize INT = 100000;          -- 100k records per batch
DECLARE @RowsAffected INT = 1;
DECLARE @TotalProcessed INT = 0;
DECLARE @StartTime DATETIME = GETDATE();
DECLARE @Msg NVARCHAR(200);

WHILE @RowsAffected > 0
BEGIN
    BEGIN TRANSACTION;

    -- Update one batch of rows not yet migrated.
    -- Rows with a NULL source value are skipped: they would stay NULL after
    -- the UPDATE and keep the loop running forever.
    UPDATE TOP (@BatchSize) Transactions
    SET CNPJ_Payer_New = RIGHT('00000000000000' + CAST(CNPJ_Payer AS VARCHAR(20)), 14)
    WHERE CNPJ_Payer_New IS NULL
      AND CNPJ_Payer IS NOT NULL;

    SET @RowsAffected = @@ROWCOUNT;
    SET @TotalProcessed = @TotalProcessed + @RowsAffected;

    COMMIT TRANSACTION;

    -- Progress log. RAISERROR ... WITH NOWAIT sends each message immediately;
    -- PRINT output is buffered and may only show up much later.
    SET @Msg = CONCAT('Processed: ', @TotalProcessed, ' rows. Batch: ', @RowsAffected);
    RAISERROR(@Msg, 0, 1) WITH NOWAIT;
    SET @Msg = CONCAT('Elapsed time: ', DATEDIFF(SECOND, @StartTime, GETDATE()), ' seconds');
    RAISERROR(@Msg, 0, 1) WITH NOWAIT;

    -- Pause between batches (reduces contention)
    WAITFOR DELAY '00:00:01';             -- 1 second between batches
END;

PRINT 'Migration completed! Total rows: ' + CAST(@TotalProcessed AS VARCHAR(20));
```


**Configurable parameters:**

- `@BatchSize`: 100k (balanced between performance and lock time)
  - Too small = many transactions, more overhead
  - Too large = prolonged locks, production impact (SQL Server may also escalate to a table lock once a single statement holds roughly 5,000 locks, so the batch size should be validated under load)
- `WAITFOR DELAY`: 1 second (gives other queries time to run)

**Time estimates:**

| Records | Batch Size | Estimated Time |
|-----------|------------|----------------|
| 8,000,000 | 100,000 | ~2-3 hours |
| 90,000,000 | 100,000 | ~20-24 hours |

**Advantages:**
- Doesn't freeze the application
- Other queries can run between batches
- Can pause (Ctrl+C) and resume later: only the in-flight batch is rolled back, and `WHERE ... IS NULL` picks up where it left off
- Real-time progress log

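Progress can also be checked from a second session while the loop runs. The sketch below assumes the `IX_Temp_CNPJ_New` filtered index from Phase 1 exists, which keeps counting the remaining rows cheap. The total comes from partition metadata and is approximate.

```sql
-- Progress check, intended to run from a separate session.
DECLARE @Remaining BIGINT, @Total BIGINT;

-- Rows still waiting to be migrated (can be answered from the filtered index).
-- Under the default locking READ COMMITTED level this may wait briefly
-- on the batch that is currently in flight.
SELECT @Remaining = COUNT_BIG(*)
FROM Transactions
WHERE CNPJ_Payer_New IS NULL;

-- Approximate total row count from metadata (heap or clustered index only).
SELECT @Total = SUM(row_count)
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('Transactions')
  AND index_id IN (0, 1);

SELECT
    @Total     AS total_rows,
    @Remaining AS remaining_rows,
    CAST(100.0 * (@Total - @Remaining) / NULLIF(@Total, 0) AS DECIMAL(5, 2)) AS pct_migrated;
```

Dividing the elapsed time by `pct_migrated` gives a rough estimate of the total run time, which is enough for status reports.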

---

### Phase 3: Constraint Removal

```sql
-- Identify the PK that includes the column.
-- sys.key_constraints has no column reference, so go through the backing index.
SELECT kc.name
FROM sys.key_constraints AS kc
JOIN sys.index_columns AS ic
  ON ic.object_id = kc.parent_object_id
 AND ic.index_id  = kc.unique_index_id
WHERE kc.type = 'PK'
  AND kc.parent_object_id = OBJECT_ID('Transactions')
  AND COL_NAME(ic.object_id, ic.column_id) = 'CNPJ_Payer';

-- Remove FKs first: a PK cannot be dropped while an FK still references it
ALTER TABLE Payments
DROP CONSTRAINT FK_Payments_Transactions;

-- Remove PK
ALTER TABLE Transactions
DROP CONSTRAINT PK_Transactions_CNPJ;
```

**Estimated time:** ~10 minutes (depends on how many constraints exist)

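The drop statements name one foreign key explicitly. In a real schema the referencing constraints would first have to be found. A minimal sketch using the catalog views, with the table and column names taken from the example:

```sql
-- List every FK that references Transactions.CNPJ_Payer,
-- together with the referencing table and column.
SELECT
    fk.name                                              AS fk_name,
    OBJECT_NAME(fk.parent_object_id)                     AS referencing_table,
    COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS referencing_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
  ON fkc.constraint_object_id = fk.object_id
WHERE fk.referenced_object_id = OBJECT_ID('Transactions')
  AND COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) = 'CNPJ_Payer';
```

Each referencing column found this way has to go through the same column swap, since a foreign key requires matching data types on both sides.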

---

### Phase 4: Column Swap (Renaming)

```sql
-- Drop the temporary index first: it is no longer needed once every row is
-- migrated, and an index on the column can block the ALTER COLUMN below
DROP INDEX IF EXISTS IX_Temp_CNPJ_New ON Transactions;

-- Rename old column to _Old
EXEC sp_rename 'Transactions.CNPJ_Payer', 'CNPJ_Payer_Old', 'COLUMN';

-- Rename new column to original name
EXEC sp_rename 'Transactions.CNPJ_Payer_New', 'CNPJ_Payer', 'COLUMN';

-- Change to NOT NULL (after validating 100% populated)
ALTER TABLE Transactions
ALTER COLUMN CNPJ_Payer VARCHAR(18) NOT NULL;
```

**Estimated time:** ~1 second for the renames (metadata change); the `NOT NULL` change has to check every row for NULLs, so it takes longer.


---

### Phase 5: Constraint Recreation

```sql
-- Recreate PK with new VARCHAR column
ALTER TABLE Transactions
ADD CONSTRAINT PK_Transactions_CNPJ
PRIMARY KEY CLUSTERED (CNPJ_Payer);

-- Recreate FKs (the referencing column must already be VARCHAR(18) as well)
ALTER TABLE Payments
ADD CONSTRAINT FK_Payments_Transactions
FOREIGN KEY (CNPJ_Payer) REFERENCES Transactions(CNPJ_Payer);
```

**Estimated time:** ~30-60 minutes (depends on volume)

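If constraints are recreated `WITH NOCHECK` to save time, SQL Server marks them as not trusted and the optimizer stops relying on them. A hedged sketch for spotting that situation on the example `Payments` table and re-validating one constraint:

```sql
-- FKs on Payments that are disabled or not trusted after recreation.
-- An empty result means nothing needs to be re-validated.
SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE parent_object_id = OBJECT_ID('Payments')
  AND (is_disabled = 1 OR is_not_trusted = 1);

-- Re-validate existing rows for a constraint listed above.
-- This scans the referencing table, so it belongs in a low-load window.
ALTER TABLE Payments WITH CHECK CHECK CONSTRAINT FK_Payments_Transactions;
```

The `ADD CONSTRAINT` statements in this phase validate existing rows by default, so the check is mainly a safety net for scripts that deviate from them.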

---

### Phase 6: Validation and Cleanup

```sql
-- Validate that 100% was migrated (expected result: 0)
SELECT COUNT(*)
FROM Transactions
WHERE CNPJ_Payer IS NULL OR CNPJ_Payer = '';

-- Validate referential integrity
DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS;

-- If everything OK, remove old column
-- (the space it used is only reclaimed after an index rebuild)
ALTER TABLE Transactions
DROP COLUMN CNPJ_Payer_Old;

-- Remove temporary index (if it still exists)
DROP INDEX IF EXISTS IX_Temp_CNPJ_New ON Transactions;
```

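Before the old column is dropped, the migrated values can be compared against it. The sketch below assumes the post-swap column names from Phase 4 and the same zero-padding rule used in the batch script. It has to run before the `DROP COLUMN` statement above.

```sql
-- Count rows where the new value differs from the zero-padded old value.
-- A result of 0 means every migrated value matches its source.
SELECT COUNT_BIG(*) AS mismatched_rows
FROM Transactions
WHERE CNPJ_Payer_Old IS NOT NULL
  AND CNPJ_Payer <> RIGHT('00000000000000' + CAST(CNPJ_Payer_Old AS VARCHAR(20)), 14);
```

On a table of this size the comparison is a full scan, so it is best scheduled like any other heavy read.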

---

## CNPJ Fast Process Customization

### Differences vs. Original Process

The original **CNPJ Fast** process was **restructured** for this client:

**Main changes:**

| Aspect | Original CNPJ Fast | Client (Customized) |
|---------|-------------------|---------------------|
| **Focus** | Applications + DB | DB only (no proprietary software) |
| **Discovery** | App inventory | Schema analysis only |
| **Execution** | Multiple applications | Massive SQL scripts |
| **Batch Size** | 50k-100k | 100k (optimized for volume) |
| **Monitoring** | Manual + tools | Real-time SQL logs |
| **Rollback** | Complex process | Simple (DROP COLUMN) |

**Reason for restructuring:**
- Client has no proprietary applications (only consumes data)
- 100% focus on database optimization
- Much larger volume than typical cases (100M vs ~10M)

---

## Tech Stack

`SQL Server` `T-SQL` `Batch Processing` `Performance Tuning` `Database Optimization` `Migration Scripts` `Phased Commits` `Index Optimization` `Constraint Management`


---

## Key Decisions & Trade-offs

### Why 100k per batch?

**Performance tests:**

| Batch Size | Time/Batch | Lock Duration | Contention |
|------------|-------------|---------------|-----------|
| 10,000 | 2s | Low | Minimal |
| 50,000 | 8s | Medium | Acceptable |
| **100,000** | 15s | **Medium** | **Balanced** |
| 500,000 | 90s | High | Production impact |
| 1,000,000 | 180s | Very high | Unacceptable |

**Choice:** 100k offers the best balance between performance and impact.

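Contention of the kind rated in the table can be observed while a batch runs. One possible way, sketched here with the `sys.dm_exec_requests` dynamic management view (querying it requires the `VIEW SERVER STATE` permission), is to list the requests that are currently blocked:

```sql
-- Requests currently blocked by another session, longest wait first.
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time            AS wait_time_ms,
    r.command,
    DB_NAME(r.database_id) AS database_name
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;
```

If application sessions regularly show up here with the migration session as the blocker, that points toward a smaller batch size or a longer pause.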
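
---

### Why create column at END?

**SQL Server internals:**
- `ALTER TABLE ... ADD` always appends the column at the end; for a nullable column without a default this is a metadata-only change (fast)
- Placing a column in the middle (for example through the SSMS table designer) recreates the table and copies every row (slow)
- For large tables, position matters
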

---

### Why WAITFOR DELAY of 1 second?

**Without delay:**
- Batch processing can saturate I/O
- Application queries slow down
- Lock escalation may occur

**With 1s delay:**
- Other queries have a window to execute
- I/O load is spread out over time
- User experience preserved

**Trade-off:** Each batch takes 1 second longer (about 7% on a 15 s batch, per the test table above), but the system remains responsive.

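The total cost of the pause is easy to bound. A small worked calculation, using the row count of the `Transactions` table and the batch size from this case study:

```sql
-- Total pause time added by WAITFOR DELAY for the largest table.
DECLARE @Rows BIGINT = 90000000;      -- Transactions row count from the analysis
DECLARE @BatchSize INT = 100000;
DECLARE @DelaySeconds INT = 1;

DECLARE @Batches BIGINT = CEILING(1.0 * @Rows / @BatchSize);

SELECT
    @Batches                        AS batches,          -- 900
    @Batches * @DelaySeconds        AS total_delay_s,    -- 900
    @Batches * @DelaySeconds / 60.0 AS total_delay_min;  -- 15
```

Across 900 batches the pauses add about 15 minutes in total, which is small next to a run measured in hours.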

---

## Current Status & Next Steps

### Current Status (December 2024)

**Preparation Phase:**
- Discovery complete (100M records identified)
- Migration scripts developed
- Tests in staging environment
- Performance validation in progress
- Awaiting production maintenance window

### Next Steps

1. **Complete production backup**
2. **Production execution** (24/7 environment)
3. **Real-time monitoring** during migration
4. **Post-migration validation** (integrity, performance)
5. **Lessons learned documentation**


---

## Lessons Learned (So Far)

### 1. Previous Experience is Gold

The decision to use phased commits came from **practical experience** in previous projects, not from documentation or AI.

**Similar previous situations:**
- E-commerce data migration (50M records)
- Encoding conversion (UTF-8 in 100M+ rows)
- Historical table partitioning

---

### 2. "Measure Twice, Cut Once"

Before executing in production:
- Exhaustive tests in staging
- Scripts validated and reviewed
- Rollback tested
- Time estimates confirmed

**Preparation time:** 3 weeks
**Execution time:** Estimated at 48 hours

**Ratio:** about 10:1 (preparation vs execution)


---

### 3. Customization > One-Size-Fits-All

The original CNPJ Fast process needed to be **restructured** for this client.

**Lesson:** Processes should be:
- Structured enough to repeat
- Flexible enough to adapt

---

### 4. Monitoring is Crucial

Scripts with **detailed progress logs** make it possible to:
- Estimate remaining time
- Identify bottlenecks
- Pause/resume with confidence
- Report status to stakeholders

Log example:

```
Processed: 10,000,000 rows. Batch: 100,000
Elapsed time: 3600 seconds (10% complete, ~9h remaining)
```


---

## Performance Optimizations

### Optimizations Implemented

1. **Temporary index WHERE NULL**
   - Speeds up lookup of unmigrated records
   - Removed after completion

2. **Optimized batch size**
   - Balanced between performance and lock time

3. **Transaction log management**
   ```sql
   -- Check log growth
   DBCC SQLPERF(LOGSPACE);

   -- Adjust recovery model (if allowed).
   -- Caution: switching to SIMPLE breaks the log backup chain; after switching
   -- back to FULL, a full or differential backup is needed to restart it.
   ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;
   ```

4. **Execution during low-load hours**
   - Overnight maintenance window
   - Weekend (if possible)

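For the transaction log item above, `DBCC SQLPERF(LOGSPACE)` reports usage for every database on the instance. A narrower sketch for the database being migrated, assuming it is the current database of the session, could look like this:

```sql
-- Log size and usage for the current database.
SELECT
    total_log_size_in_bytes / 1048576.0 AS log_size_mb,
    used_log_space_in_bytes / 1048576.0 AS log_used_mb,
    used_log_space_in_percent           AS log_used_pct
FROM sys.dm_db_log_space_usage;

-- Why the log cannot be truncated right now
-- (for example LOG_BACKUP or ACTIVE_TRANSACTION).
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();
```

Under the FULL recovery model, a `log_reuse_wait_desc` of `LOG_BACKUP` suggests that more frequent log backups during the migration would keep the log from growing, without changing the recovery model.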

---

**Expected result:** Migration of 100 million records in ~48 hours, without significant downtime and with the possibility of fast rollback.

[Need to migrate massive data volumes? Get in touch](#contact)