It does seem like it should be faster, but this isn't a vanilla export/import. H...

collyw · on Sept 24, 2014

What about importing the unstructured data into one table in Postgres and querying that table to populate a structured schema on the same Postgres server? It shouldn't be as IO bound, as its copying data from Postgres to another table rather than from Mongo. Can anyone tell me if that would likely bring a performance improvement?

_3u10 · on Sept 24, 2014

Probably, when dealing with IO bound problems it's important to think of IO as a really long pipe extremely fat pipe. Think oil pipeline.

The key is keeping the pipe full, requesting a piece of data based on what's in the pipe will mean that it's going to be hard to keep the pipe full. You want to request the data long before you're ever going to need it.

If your SQL queries are done in such a way that they don't depend on data in the pipe they'll work fast, if they aren't you'll be not much better off than whatever was currently in place.

allegory · on Sept 24, 2014

That's usually part of the ETL pipeline. If you look at SSIS and spring integration for example it has the ability to do this in the pipeline.

_3u10 · on Sept 24, 2014

You mean import the raw data into a json column and then SELECT DISTINCT INTO?

There are only 50 million rows... this sounds like a 5 minute query.

rjbrock · on Sept 24, 2014

I was assuming he was moving the json into a structured foreign key-> data type of relationship

zenogais · on Sept 24, 2014

It was actually a combination of several queries in the first database followed by several inserts into postgresql.

kermatt · on Sept 24, 2014

Ratio of query time vs insert time?

jerrysievert · on Sept 25, 2014

i'm curious as well - even if the data was only somewhat structured on the mongo side, an import into a single column table with JSON should be fairly quick with \COPY, then with a couple of indexes applied on your disparate data in the JSON data you should have been able to query that into a new table structure quite easily.

just wondering what the tradeoffs were in OP's eyes.