A Python utility for migrating data between PostgreSQL databases.
datatrasnfer.py is a database migration tool that transfers data from a source PostgreSQL table to a target PostgreSQL table, with support for large datasets through chunked processing and comprehensive error handling.
- Chunked Data Transfer: Reads and transfers data in configurable chunks (default: 500 records) to handle large tables efficiently
- Cross-Database Support: Can transfer data between different PostgreSQL servers
- Transaction Management: Commits data per chunk with automatic rollback on errors
- Comprehensive Logging: All operations are logged to `migrator.log` for an audit trail and debugging
- Error Handling: Gracefully handles connection errors and insertion failures with detailed logging
- Column Preservation: Automatically detects and maintains column order from source table
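The chunked read described above can be sketched as a small generator. This is an illustration only; `fetch_chunks` is a hypothetical helper, not a function taken from datatrasnfer.py:

```python
def fetch_chunks(cursor, chunk_size=500):
    """Yield successive batches of rows from an open database cursor.

    Works with any DB-API 2.0 cursor (e.g. psycopg2's), since it relies
    only on the standard fetchmany() method.
    """
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows
```

Each yielded batch can be inserted and committed before the next one is read, which keeps memory usage bounded even for very large source tables.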
- Python 3.6+
- PostgreSQL connection access to both source and target databases
- Dependencies listed in `requirements.txt`
```
pip install -r requirements.txt
pip install -r requirements-dev.txt
```

```
# Run all tests with coverage report
pytest tests/ -v --cov=datatrasnfer --cov-report=html

# Run specific test file
pytest tests/test_datatrasnfer.py -v

# Run specific test class or method
pytest tests/test_datatrasnfer.py::TestGetConnection -v
```

```
# Run flake8 linting
flake8 datatrasnfer.py --max-line-length=120

# Run pylint
pylint datatrasnfer.py

# Format code with black
black datatrasnfer.py
```

After running tests, view the HTML coverage report:
```
# Generated in htmlcov/index.html
```

- Configure Connection Credentials

  Edit `datatrasnfer.py` and update the `source_conf` and `target_conf` dictionaries with your database credentials:

  ```python
  source_conf = {
      'host': 'source_host',
      'port': 5432,
      'database': 'source_db',
      'user': 'source_user',
      'password': 'source_password',
  }

  target_conf = {
      'host': 'target_host',
      'port': 5432,
      'database': 'target_db',
      'user': 'target_user',
      'password': 'target_password',
  }
  ```
- Specify Source and Target Tables

  Update the `migrate_table()` call with your schema and table names:

  ```python
  migrate_table(
      source_conf,
      target_conf,
      src_schema='source_schema',
      src_table='source_table',
      tgt_schema='target_schema',
      tgt_table='target_table',
      chunk_size=500,  # Adjust as needed
  )
  ```
- Run the Migration

  ```
  python datatrasnfer.py
  ```
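The configuration dictionaries in the first step map almost directly onto psycopg2 connection arguments; the one mismatch is that `psycopg2.connect()` expects `dbname` where the dictionaries use `'database'`. A minimal sketch of that translation, assuming psycopg2 is the driver (`conf_to_connect_kwargs` is a hypothetical helper, not part of datatrasnfer.py):

```python
def conf_to_connect_kwargs(conf):
    """Translate a source_conf/target_conf dict into psycopg2.connect() kwargs."""
    return {
        'host': conf['host'],
        'port': conf['port'],
        'dbname': conf['database'],  # psycopg2 uses 'dbname', not 'database'
        'user': conf['user'],
        'password': conf['password'],
    }

# Usage (requires psycopg2):
# import psycopg2
# conn = psycopg2.connect(**conf_to_connect_kwargs(source_conf))
```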
- `chunk_size`: Number of records to transfer per batch (default: 500). Reduce for memory-constrained environments, increase for better performance.
All operations are logged to `migrator.log` with timestamps and detailed error messages. Check this file to monitor migration progress or troubleshoot issues.
- Ensure the target table exists and has the same schema as the source table before running the migration
- The script currently continues to the next chunk on insertion errors; modify the error handling logic if you prefer different behavior
- Consider running on a test environment first to validate the migration process
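The continue-on-error behaviour noted above reduces to a per-chunk try/except around the insert and commit. Here is a hedged sketch of that pattern with an opt-in fail-fast switch; the function name, parameters, and SQL are illustrative, not copied from datatrasnfer.py, and the loop works with any DB-API 2.0 connection:

```python
import logging

def insert_chunks(conn, cursor, insert_sql, chunks, stop_on_error=False):
    """Insert each chunk in its own transaction, logging failures.

    With stop_on_error=False this mirrors the behaviour described above:
    roll back the failed chunk and continue with the next one.
    """
    inserted = 0
    for chunk in chunks:
        try:
            cursor.executemany(insert_sql, chunk)
            conn.commit()            # commit per chunk
            inserted += len(chunk)
        except Exception as exc:
            conn.rollback()          # discard only the failed chunk
            logging.error('chunk failed: %s', exc)
            if stop_on_error:
                raise                # fail fast instead of skipping
    return inserted
```

Changing the default of `stop_on_error` (or simply re-raising in the `except` block) is the smallest edit that switches the tool from skip-and-log to abort-on-first-error.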