
I also agree that relying on database schema is a show stopper.

First counterpoint: As long as data type information isn't completely messed up, I can dump Excel spreadsheets in a database, or dump database CSVs in a data lake, and start querying them right away using SQL with complex joins using auto-completion from the dataset alone.

Second counterpoint (harder to communicate): I work with a SaaS database (MS Dynamics/Dataverse) which doesn't provide direct SQL access and is not supported by common ORMs. Some of its data APIs require relationship schema information, which invariably throws a wrench into attempts to generalize functionality, or even just to consume the APIs.

In this context, creating a simple in-memory test database or serializing records between modules cannot possibly work without also knowing the schema information, if queries are going to make use of it. So now you need to carry schema information for everything, and load it at run time from generated code or a live database, in both back end and front end, just to interact with data where simple SQL would have worked just fine.

Conclusion: I dislike the proposal -- even if it doesn't break backward compatibility, it brings database-configuration details into SQL. SQL is imperfect and is already hurt by database-implementation details (like date functions), but it remains a beautiful expression of relational algebra and set theory: a query given the same data should return the same output, regardless of context. SQL is a lingua franca for a reason.

The proposal feels like a fairly specific developer-centric extension and isn't where SQL should be headed, in my opinion.



> First counterpoint: As long as data type information isn't completely messed up, I can dump Excel spreadsheets in a database, or dump database CSVs in a data lake, and start querying them right away using SQL with complex joins using auto-completion from the dataset alone.

In your example, all you have is data and no foreign keys, and that's the show stopper? That means you have all the relationships in your head, and that's how you can write complex joins right away? Sure, if that's the case, then you can't use foreign keys, since you don't have any. I don't see how this would be a counterpoint though. There is nothing forcing you to use JOIN FOREIGN; you could just do what you describe. But I'm sure you are aware that many databases have foreign keys for all relationships to enforce referential integrity. I should have mentioned in the proposal that its scope is limited to such databases.

I enjoyed your example though, and I want to share a similar one. A few times I've had to deal with data shipped as multiple CSV files, but without any schema at all. What I tend to do then is quickly write a very loose data model with mostly text columns, so it accepts any values. Once the CSVs are in SQL, I can clean up the data step by step, by inspecting the tables and converting the text columns to proper data types. Next, when some column(s) in one table seem to be referencing some column(s) in another table, based on the content of the columns in both tables, I try to add a FOREIGN KEY with a suitable name between those columns. If that succeeds, we know there is referential integrity between the columns, and we now also have a name to describe the relationship. Win-win! Otherwise, if the foreign key could not be created, I investigate which rows appear in the referencing table but are not present in the referenced table, using a NOT EXISTS (...) query. If the extra rows can safely be deleted, e.g. because I forgot to handle empty string values as NULL values, I can then try to create the foreign key again.
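
The workflow above can be sketched roughly as follows. This is only an illustration under assumptions, not what the commenter actually ran: the `customers` and `orders` tables, their rows, and the constraint name `placed_by` are all made up, and it uses SQLite through Python's `sqlite3` module just to be self-contained. SQLite cannot add a foreign key to an existing table, so the "try to add the FOREIGN KEY" step becomes "copy into a typed table that declares it"; in a database such as PostgreSQL you would use ALTER TABLE ... ADD CONSTRAINT instead. The orphan here is an empty string, so the sketch nulls it out rather than deleting the row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Loose staging tables with text-only columns, as if bulk-loaded from CSV files.
# Table names and rows are hypothetical.
conn.executescript("""
    CREATE TABLE customers (id TEXT, name TEXT);
    CREATE TABLE orders (id TEXT, customer_id TEXT);
    INSERT INTO customers VALUES ('1', 'Ada'), ('2', 'Grace');
    INSERT INTO orders VALUES ('10', '1'), ('11', '2'), ('12', '');
""")

# Rows in the referencing table whose value has no match in the referenced table.
ORPHANS = """
    SELECT o.id, o.customer_id
    FROM orders AS o
    WHERE o.customer_id IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM customers AS c WHERE c.id = o.customer_id)
"""

orphans_before = conn.execute(ORPHANS).fetchall()
print(orphans_before)  # [('12', '')]

# The only orphan is an empty string that should have been loaded as NULL.
conn.execute("UPDATE orders SET customer_id = NULL WHERE customer_id = ''")
orphans_after = conn.execute(ORPHANS).fetchall()
print(orphans_after)  # []

# With no orphans left, move the data into typed tables that declare a named
# foreign key (the constraint name is an assumption for illustration).
conn.executescript("""
    CREATE TABLE customers_typed (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders_typed (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        CONSTRAINT placed_by FOREIGN KEY (customer_id)
            REFERENCES customers_typed (id)
    );
    INSERT INTO customers_typed
        SELECT CAST(id AS INTEGER), name FROM customers;
    INSERT INTO orders_typed
        SELECT CAST(id AS INTEGER), CAST(customer_id AS INTEGER) FROM orders;
""")
```

Had the empty string not been cleaned up first, the final INSERT would fail with an integrity error, which is the same signal as the foreign key refusing to be created.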



