Originally posted at michaelzanggl.com. Subscribe to my newsletter to never miss out on new content.
Note: Some of these might also exist in other database systems.
utf8 is not actually utf8
Ever ran into this SQL exception? Incorrect string value: β\xF0\x9F\x98\x83 <β¦β for column...
You probably ran into the same trap as many others. Turns out mySQL's utf8 charset can only store around 6% of all possible unicode code points. This also excludes emoticons for example.
insert into emails(body) values ('π¦π¦π¦');
That's why you should use utf8mb4
instead.
Read more about it and how to migrate: https://mathiasbynens.be/notes/mysql-utf8mb4
comparing varchar_field
with false
or 0
Imagine the following contrived example
function normalizeEmail(email) {
if (!validate(email)) {
return false
}
return normalize(email)
}
// somewhere in the code
await User.where('email', normalizeEmail(email)).first()
The imaginary ORM would execute the query select * from users where email = <prepared value> LIMIT 1
If the validation inside normalizeEmail
was successful, the query would be select * from users where email = 'normalized.email@test.com' LIMIT 1
.
If the validation was not successful, the query would be select * from users where email = false LIMIT 1
.
Now go ahead and run something like select * from users where <varchar field> = false
in your database system.
Since the fields are not comparable, MySQL will convert one to the other, make them match and return you all users. Our ORM will simply pick the first user and continue the logic with that. π¬ Pretty dangerous.
The same happens with
field = 0
insert on duplicate key update
creates primary key holes
Imagine we have a table statistics
with the columns id
(AI), fkid
, title
.
INSERT INTO statistics (fkid, title) VALUES (1, 'first');
This will insert a new record with the id
1. Let's imagine there is some batch which always inserts or updates the title. It might execute the following queries over some time:
INSERT INTO statistics (fkid, title) VALUES (1, 'second') ON DUPLICATE KEY UPDATE title = 'second';
INSERT INTO statistics (fkid, title) VALUES (1, 'third') ON DUPLICATE KEY UPDATE title = 'third';
INSERT INTO statistics (fkid, title) VALUES (1, 'fourth') ON DUPLICATE KEY UPDATE title = 'fourth';
Finally we want to insert a record with a new fkid.
INSERT INTO statistics (fkid, title) VALUES (100, 'first') ON DUPLICATE KEY UPDATE title = 'first';
This inserts a new record in the table, but guess what the id
is? You'd might expect it to be 2, but in reality, everytime insert on duplicate key update
failed to insert and processed the update
, it incremented the auto increment internally. That means the id
of this record would be 5.
It's not too important that the IDs are really sequential, but you might wonder what's the cause of it.
Read more about it: https://stackoverflow.com/questions/38347110/prevent-innodb-auto-increment-on-duplicate-key
int(2) is not what you think it means
The length on INTs have no real meaning in MySQL, you can still insert values like 9999999. They only restrict how many characters should be displayed in the command line client. π€¨
Read more about it: https://stackoverflow.com/questions/5634104/what-is-the-size-of-column-of-int11-in-mysql-in-bytes
Are there any other traps you are aware of?
Top comments (2)
Amazing
This is a really informative article. Thank a lot!