Basically you create a Docker container from some Postgres image.
Then you run DDL scripts.
Then you stop this container and commit it as a new image.
And now you can create a new container from this new image and use it for tests. You can even parallelize tests by launching multiple containers. It should be fast enough thanks to Docker's overlay filesystem magic.
And it should work with any persistence solution, not just Postgres.
You can save some time and complexity by running a single container: set up your template database `my_template` once, then create your testing databases using `CREATE DATABASE foo TEMPLATE my_template`. Basically TFA.
This will be much faster than restarting postgres a bunch of times, since this will just `cp` the database files on disk from the template to the new database.
The only "problem" is your application will need to switch to the new database name. You can also just put pgbouncer in front and let it solve that, if you want.
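A minimal sketch of what that could look like on the test-harness side, assuming you generate a unique database name per test and hand the resulting connection URL to the application. The helper names, the name prefix, and the base URL are hypothetical; only `my_template` and the `CREATE DATABASE ... TEMPLATE` statement come from the comment above.

```python
import uuid

TEMPLATE_DB = "my_template"  # template name from the comment above


def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def new_test_db_name(prefix="test"):
    """Generate a unique, identifier-safe database name for one test."""
    return f"{prefix}_{uuid.uuid4().hex[:12]}"


def create_db_sql(db_name, template=TEMPLATE_DB):
    """SQL that clones the template into a fresh database."""
    return f"CREATE DATABASE {quote_ident(db_name)} TEMPLATE {quote_ident(template)}"


def drop_db_sql(db_name):
    """SQL that removes the per-test database afterwards."""
    return f"DROP DATABASE IF EXISTS {quote_ident(db_name)}"


def dsn_for(db_name, base="postgresql://postgres:test@localhost:5432"):
    """Connection URL for the application under test (base URL is an assumption)."""
    return f"{base}/{db_name}"


if __name__ == "__main__":
    name = new_test_db_name()
    print(create_db_sql(name))  # run this through your driver or psql
    print(dsn_for(name))        # point the application at this URL
    print(drop_db_sql(name))    # run after the test finishes
```

Keep in mind that Postgres refuses to copy a template while other sessions are connected to it, so nothing else should hold a connection to `my_template` while tests are being set up.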
I would think the fundamental issue with this is similar to what the author described with template databases:
> However, on its own, template databases are not fast enough for our use case. The time it takes to create a new database from a template database is still too high for running thousands of tests:
And then the timing shows that this took about 2 seconds. Launching another container is surely going to be at least that slow, correct?
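To put a rough number on it, here is a back-of-the-envelope estimate. The 2 seconds per database is the figure quoted above; the 1000 tests and 8 workers are my own illustrative assumptions, not numbers from the article.

```python
import math


def setup_overhead_seconds(num_tests, per_db_seconds, workers=1):
    """Wall-clock time spent only on creating databases, one fresh database per test."""
    tests_per_worker = math.ceil(num_tests / workers)
    return tests_per_worker * per_db_seconds


if __name__ == "__main__":
    print(setup_overhead_seconds(1000, 2.0))             # 2000.0 seconds, sequential
    print(setup_overhead_seconds(1000, 2.0, workers=8))  # 250.0 seconds with 8 workers
```

So even with generous parallelism, the setup cost alone runs to minutes under these assumptions, before any test has done real work.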
So it's clear the author is trying to get an "absolutely clean slate" for each of potentially many tests. That may not be what all teams need, but I will say we had an absolute beast of a time as we grew our test suite: once we parallelized it, we would get random test failures from tests stepping on each other's toes. So I really like the approach of starting with a totally clean template for each test.
Starting each test with a totally clean template ensures consistent reproducibility, I’ll give you that, but it isn’t real world either. You only have the data that the test seeds, which favors tests (and by extension business logic) written only with the “happy path” in mind. I think when tests step on each other and cause flaky runs, the smell is that the logic being tested isn’t written for non-happy paths, or that the tests assert on entirely too specific datasets or outcomes, so anything else results in a failure. In the real world, other systems may very well be touching the same data or tables that your code is interacting with. Code that can handle that kind of situation makes for a more fault-tolerant system overall, one that keeps delivering value even if other systems go haywire and produce unexpected data. Of course, the need to handle that extra complexity is going to vary depending on business needs or other determining factors.
This is absolutely the correct answer. Testing infra should be ephemeral and mostly stateless. Prior to Docker you had to figure out ways to mock the database or approximate it with a lightweight DB like H2 or SQLite.
With docker you can build out the test image with default usernames/passwords/etc...
Then as your install gets more complicated with stored procedures and the like you can add them to your test database and update local testing tooling and CI/CD to use that.
The massive benefit here is that you're using the exact same code to power your tests as you use to power your production systems. This eliminates issues caused by differences between prod and test environments, and anyone who's debugged those issues knows how long they can take, because it can take a really long time just to figure out that this is where the issue lies.