In my limited experience, code built around relational databases is difficult to grok. The structure is inverted with respect to typical OOP. Rather than having a sane class hierarchy, where one can start at the top and navigate down to understand how objects are nested, the structure is inverted, piecing together class relationships requires looking at the tables and following foreign keys which effectively point to parent classes. It feels backwards.
> In my limited experience, code built around relational databases is difficult to grok. The structure is inverted with respect to typical OOP. Rather than having a sane class hierarchy, where one can start at the top and navigate down to understand how objects are nested, the structure is inverted, piecing together class relationships requires looking at the tables and following foreign keys which effectively point to parent classes. It feels backwards.
Other than your first sentence (which is subjective), you are correct. The only question is whether you are going to mangle the database structure to fit your OO hierarchy or design your program in a non-OO way to fit the relational structure.
Since the database will live on long past the program, and will have multiple programs talking to it, it makes sense to design your program around the data, not design the database structure around your program.
I think it’s worth noting that “database-centric” and “app-centric” notions of databases are both found in the wild. That’s how I remembered the difference between MySQL and PostgreSQL—MySQL was what happened when a bunch of app developers needed a database, and it was extremely popular e.g. with the PHP webdev crowd. If you needed to access the database, you went through the app. PostgreSQL is what happened when DBAs designed a database. If you needed access to the database, you connected to the database. A lot of other databases can be understood this way… like how MongoDB further shifts database concepts into the application.
(Honestly I’m definitely in your camp… if I need a database, design the database first, then write the code.)
You're right: object orientation and relational data don't mesh well together.
Where you're wrong, and it may take more experience to realize, is that the problems in the mismatch lie far more on the OOP side than on the relational side.
Object oriented designs - the navigable graphs of objects, leaving aside polymorphism - usually privilege a particular perspective of the data, a perspective which is suited to a specific application. For example, a store with items, customers and orders: the application is focused on creating transactions. Relational models, on the other hand, support multiple perspectives "out of the box" and are expected to support multiple applications; for example, reports that roll up by item, by category and by customer; reusing the same customers for a different application; and so on.
Different applications and perspectives usually require different object models. The alternate application reusing the same customer entities won't want to couple with a store application and decorate every customer with an irrelevant list of orders. A reporting app is better off with a visual query builder that creates SQL to hand off to the database than trying to iterate over object graphs performing aggregations and sorts. And so on.
For applications with ambitions for a longer lifespan, start with the database schema. Keep application models fairly thin; focus on verbs; try to ensure only one piece of code owns mutating a logical database entity with a set of verbs; and keep entanglements like UI out of those verb implementations so that you reuse the verbs, rather than being too tempted to reimplement direct database modifications or push too much logic into models.
Object oriented models come and go, applications are rewritten, but database migrations happen at a much slower pace, and data lasts for a very long time.
Class hierarchies are not "sane", they're inherently hard to extend in a way that preserves a consistent semantics. This is exactly what the relational model is intended to fix.
Foreign keys does not correspond to class hierarchies in OOP. If anything, they correspond to object composition.
E.g. you might have a Customer table and an Order table with FK from Order to Customer. This would correspond to a Customer object holding a collection of Order objects - it would not correspond to Order being a subclass of Customer.
In any case, the relational model is not supposed to map directly to any particular programming model or paradigm. You should be able to interact with relational data from OOP, imperative, functional and declarative languages.
> Foreign keys does not correspond to class hierarchies
If a foreign key is primary in two tables (1:1 relation), then it can correspond well to a hierarchical relation. In other cases, foreign keys can also correspond to composition as you say, and it depends on intention.
An example of inheritance could be the table Parties with primary key Party_ID. The table Customers has primary FK Party_ID (Parties.Party_ID). If you must use ORM, you can implement this is as Customer extends Party.
The class Customer then inherits whatever ability the class Party has to access further tables with FK Party_ID, ex something like PartyAddresses.
This might make more sense if you also have a table Providers with primary FK Party_ID (Parties.Party_ID), and the table Orders with FKs provider_Party_ID (Providers.Party_ID) and customer_Party_ID (Customers.Party_ID).
Normalizing like this allows you to have inheritance, polymorphism, and encapsulation in OOP.
You can build data models that correspond pretty well to OOP.
Lets say you have a superclass named Person, with primary key person_pk. You can then, in the data model, implement polymorphism by creating a sub-class-table, for instance CustomerPerson, by giving it the same person_pk as primary key, where the same person has the same value for the primary key.
In this case, every person should be in the Person table, but only customers should be in the CustomerPerson table.
If person is implemented in your OOP by inheritance, you can leave shared fields (variables) in the Person table, and only add additional fields to the CustomerPerson table. Your program can then instantiate Person objects by simply using the Person table, and CustomerPerson objects by joining by person_pk.
If Person is just an abstract interface, you may want to have it only contain the primary key, and have all fields in the subclass CustomerPerson table. In this case, you could join CustomerPerson directly with some address-table, for instance, by person_pk, without going through the superclass (Person).
(And if you need it to also generalize to companies, you could have both Company and Person be subclasses of a class named Subject, and use subject_pk as primary key in Subject, PersonSubject, CompanySubject, PersonCustomerSubject and CompanyCustomerSubject and even CustomerSubject)
The approach above works better in my experience, than having subclasses have their own primary keys, especially if you read more than write to the db, as it can drastically reduce the number of joins you need to do, and because it prevents one-to-many relationships between subclass and superclass. (A limitation would be if you are not able to recognize a shared primary key across subclasses when creating an object.)
Postgres actually has a concept of inherited tables to make this pattern explicit in the DDL. Although I’m not sure the PK uniqueness constraint is enforced.
There's your problem. No such thing as a sane class hierarchy, but thinking there is leads you down a dark path where relational modelling looks "difficult to grok". Rather the mental model you've trained on hierarchical OO code is the awkward one, and it's ultimately a dead end.
One part of the relational model is that all metadata itself is available as tables. So there is a table containing the foreign key relations. You can write a query that gives, for each table, which foreign keys point to it. You're not required to looking at the default view.