Curated Content: Redis as a Database, DISTINCT problems, AWS EBS volume types, Figma scaling strategy and more  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏  ͏
SQL For Devs

Delete Duplicate Rows

After some time most application will have some duplicated rows resulting in bad user experience, higher storage requirements and less database performance. The cleaning process is usually implemented in application code with complex chunking behavior as the data does not fit into memory completely. However, a single SQL query can do the whole process, including prioritization of rows and the number of duplicates to keep.

MySQL
WITH duplicates AS (
  SELECT id,  ROW_NUMBER() OVER(
    PARTITION BY firstname,  lastname,  email
    ORDER BY age DESC
  ) AS rownum
  FROM contacts
)
DELETE contacts
FROM contacts
JOIN duplicates USING(id)
WHERE duplicates.rownum > 1
PostgreSQL
WITH duplicates AS (
  SELECT id,  ROW_NUMBER() OVER(
    PARTITION BY firstname,  lastname,  email
    ORDER BY age DESC
  ) AS rownum
  FROM contacts
)
DELETE FROM contacts
USING duplicates
WHERE contacts.id = duplicates.id AND duplicates.rownum > 1;
Indexing Beyond the Basics

Learn Everything About Creating Database Indexes You don't have to know all the complicated database internals to make a slow query fast. This book teaches simple rules for creating good indexes - enriched with dozens of illustrations to make all concepts more easy to understand. You have no excuse anymore for your slow queries!

SQL For Devs SPONSOR

Curated Database Content

Other people also write great educational content. So I hand-pick some new articles for this issue you may like. Have fun learning even more.

Redis as a Database Even though the functional scope of Redis lags behind a relational database, it can be used as a database. But it should only be used under specific restrictions.
medium‌.‌com
Mr DISTINCT might not be your friend While DISTINCT is a useful feature of SQL, it is also a very interesting one. Has it been used intentionally or to hide some bugs? It is not always clear without fully understanding the schema and the query's intention.
connor-mcdonald‌.‌com
Choose the best Amazon EBS volume type for your self-managed database deployment AWS provides many EBS volume types for different price, performance and durability semantics. It is difficult to select the best one, but this article explains their differences very well.
aws‌.‌amazon‌.‌com
The growing pains of database architecture Figma outgrew its single, large Amazon RDS instance in 2020 - constantly hitting its limits on peak load. Their detailed article shares their ideas and approaches to how they finally scaled to multiple servers.
figma‌.‌com
Relational Databases Explained Databases are easy to use but very complicated internally. This handy explanation shares the workings of indexes, the B+ trees that are used, transactions and the different issues each isolation level has.
architecturenotes‌.‌co
BIG DATA IS DEAD Many developers are planning for applications that support big data. But most applications will overcomplicate it as they perfectly fit on a single server.
motherduck‌.‌com
This message was sent to phps6dnxqhkm8i24@kill-the-newsletter.com.
If you don't want to receive these emails in the future, you can unsubscribe .