Greetings from the tech trenches! I'm the CTO of Razi Title, Inc., a tech startup pushing the envelope of what's possible in our industry. As any tech team knows, maintaining a clean and reliable testing environment is an essential aspect of our daily operations. The ability to quickly set up, tear down, and refresh our testing data is critical for ensuring our software's quality and reliability. Today, I wanted to share some tricks we've developed using PostgreSQL stored procedures to facilitate this process.
In PostgreSQL, a stored procedure allows us to encapsulate and store complex SQL queries for later execution. In our case, we have created two stored procedures - `truncate_tables` and `copy_data` - that, when used in tandem, ensure we have a fresh and reliable testing environment ready for action.
Truncating Tables: `truncate_tables`
Our first stored procedure, `truncate_tables`, is a workhorse in our test data management toolkit. It loops through all tables in a given schema and truncates them, effectively wiping all data and preparing for a fresh test run.
Here is the code:
CREATE OR REPLACE PROCEDURE truncate_tables(schema_to text) AS
$$
DECLARE
_tbl text;
BEGIN
FOR _tbl IN
SELECT tablename FROM pg_tables WHERE schemaname = schema_to
LOOP
EXECUTE format('TRUNCATE TABLE %I.%I CASCADE', schema_to, _tbl);
END LOOP;
END;
$$ LANGUAGE plpgsql;
Once the old test data is purged, we use our second stored procedure, `copy_data`, to copy fresh data from our source schema to the test schema. This procedure is unique and somewhat nuanced, as it takes into account that the column order in the source and target tables might not be identical.
This difference in column order can occur even when both schemas have been generated by Django's `migrate` command, as Django does not enforce a specific column order. But fear not, our stored procedure handles this gracefully by explicitly specifying column names in both the `INSERT INTO` and `SELECT` statements. If a table does not exist in the target schema, it is simply skipped.
Here is the `copy_data` code:
CREATE OR REPLACE PROCEDURE copy_data(schema_from text, schema_to text) AS
$$
DECLARE
_tbl text;
_exists boolean;
table_arr text[];
i int;
BEGIN
-- Recursive CTE to find tables in order of foreign key dependency
WITH RECURSIVE fk_tables AS (
-- Find tables that have no foreign keys into them (base case)
SELECT
tb.oid,
tb.relname AS tablename,
array_agg(tb.relname) AS all_tables
FROM
pg_class AS tb
LEFT JOIN pg_constraint AS fk ON fk.confrelid = tb.oid
WHERE
tb.relkind = 'r'
AND fk.oid IS NULL
AND tb.relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = schema_from)
GROUP BY
tb.oid,
tb.relname
UNION ALL
-- Recursively find tables that have a foreign key into a table we've seen
SELECT
tb.oid,
tb.relname AS tablename,
all_tables || tb.relname
FROM
fk_tables
JOIN pg_constraint AS fk ON fk.conrelid = fk_tables.oid
JOIN pg_class AS tb ON fk.confrelid = tb.oid
WHERE
NOT tb.relname = ANY(all_tables)
)
SELECT array_agg(tablename) INTO table_arr FROM fk_tables;
FOR i IN 1 .. array_length(table_arr, 1)
LOOP
_tbl := table_arr[i];
EXECUTE format('SELECT EXISTS (SELECT 1 FROM pg_tables WHERE schemaname = %L AND tablename = %L)',
schema_to, _tbl) INTO _exists;
IF _exists THEN
EXECUTE (
SELECT
'INSERT INTO ' || schema_to || '.' || _tbl ||
' ("' || string_agg(column_name, '", "') || '") SELECT "' ||
string_agg(column_name, '", "') || '" FROM ' || schema_from || '.' || _tbl || ';'
FROM
information_schema.columns
WHERE
table_schema = schema_from AND table_name = _tbl
GROUP BY
table_schema,
table_name
);
END IF;
END LOOP;
END;
$$ LANGUAGE plpgsql;
These two procedures have become instrumental for us to quickly and effectively manage our test environment. For instance, we can schedule them to run before every major test suite, ensuring the latest data structure and content are in place, mimicking our production environment as closely as possible.
I hope that by sharing our approach, other teams may find inspiration or at least a starting point to address similar challenges in their development process. Happy testing!
Stay tuned for more insights from the front lines of startup tech. If you have any questions or thoughts, feel free to share them in the comments below!
Best,
Rob Zwink
CTO, Razi Title, Inc.