When it comes to SQL, one of the most unnatural things to do is to map arrays to a table, since arrays cannot be mapped into SQL columns. Let’s consider a very common case of a “tag cloud” of a blog, where every “Page” would have multiple tags associated with it. Back in my time doing my Bachelor in Computer Engineering at the University of Bologna, what we are going to explore would have qualified as a canonical exam exercise. So– let’s unpack it by considering this object:
When representing an object like "Page" with an array property "tags" in an SQL database, you will need to use a separate "Tags" table and establish a many-to-many relationship with a junction table. The "Pages" table stores page-specific information, while the "Tags" table contains unique tag names and IDs. The junction table “PageTags” connects pages to their tags using the respective IDs.
This approach allows for the efficient querying of pages by tags and simplifies tag management without altering the page structure, resulting in the following SQL:
-- Create the Page table
CREATE TABLE pages (
id INT PRIMARY KEY,
title VARCHAR(255)
-- Add other relevant attributes for a page
);
-- Create the Tags table
CREATE TABLE Tags (
id INT PRIMARY KEY,
text VARCHAR(50),
-- Add other relevant attributes for tags if needed
);
-- Create the PageTags junction table
CREATE TABLE PagesTags (
page_id INT,
tag_id INT,
PRIMARY KEY (page_id, tag_id),
FOREIGN KEY (page_id) REFERENCES pages(id),
FOREIGN KEY (tag_id) REFERENCES tags(id)
);
Platformatic DB can easily create an API on top of this approach, however, this process is slightly more verbose and requires quite a few calls to operate:
Introducing PostgreSQL Arrays
A few weeks ago, a Platformatic user– Mikeal Karon– asked us to add support for PostgreSQL Arrays, a feature I did not know about!
Thanks to PostgreSQL arrays, we can massively simplify our database schema:
-- Create the Page table with tags represented as an array
CREATE TABLE pages (
id SERIAL PRIMARY KEY,
title VARCHAR(255),
tags VARCHAR(255)[] -- Tags represented as an array of VARCHAR
-- Add other relevant attributes for a page
);
In this schema, the "tags" table includes a column named "Tags," which is defined as an array of VARCHAR(255). Instead of having a separate junction table for the many-to-many relationship between "pages" and "tags," the tags themselves are stored directly as an array within each "pages" entry.
For example, you can insert data into the "Page" table using tags like this:
INSERT INTO Page (Title, Content, CreationDate, Tags)
VALUES ('Page Title 1', 'Content of page 1', '2023-07-25', ARRAY['tag1', 'tag2', 'tag3']);
INSERT INTO Page (Title, Content, CreationDate, Tags)
VALUES ('Page Title 2', 'Content of page 2', '2023-07-26', ARRAY['tag2', 'tag4', 'tag5']);
This approach simplifies the schema and enables users to avoid the need for a separate junction table. However, it is important to keep in mind that using arrays for tags may have some limitations in terms of querying and indexing when compared to a traditional many-to-many relationship with a junction table.
As such, it is essential to consider your specific use case and data access patterns when choosing the representation for the "tags" property.
Given this schema, Platformatic DB will create a much simpler API.
This will also be significantly easier to operate:
Want to see how we built this feature? Check out our CTO Matteo Collina adding support for array columns in Platformatic DB, spread across part one and part two.
All of this is now available in Platformatic 0.32.0
How will you use this new functionality? Let us know on Twitter– or should we say, X!