Subject: DBD-mysql-4.42.0 encodes binary blobs when storing
Date: Sun, 28 May 2017 15:19:20 +0200
To: bug-DBD-mysql [...] rt.cpan.org
From: Markus Wernig <markus [...] wernig.net>

Hi all
I have an application that, among other things, has to store PDF files in a
mysql db. This has been running for almost 10 years now.
After upgrading DBD-mysql to 4.42.0, the PDF files get corrupted when
storing them to the db. It appears that they are somehow encoded in a
character set (presumably utf8), even though the column definition is
"mediumblob".
4.41.0 and earlier versions do not show that behaviour: downgrading
DBD::mysql without any other changes restores the correct behaviour.
Here is how the db is connected:
    my $dbh = DBI->connect(${dsn},
                           ${username},
                           ${passwd},
                           { RaiseError           => 1,
                             AutoCommit           => 1,
                             AutoInactiveDestroy  => 1,
                             mysql_auto_reconnect => 1,
                             mysql_enable_utf8    => 1,
                           }
                          )
        or die("DB connect failed: $DBI::errstr");
The code then goes on to insert data into tables like this one:
+---------+---------------------+------+-----+---------+-------+
| Field   | Type                | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+-------+
| id      | bigint(20) unsigned | NO   | PRI | NULL    |       |
| name    | text                | NO   |     | NULL    |       |
| data    | mediumblob          | NO   |     | NULL    |       |
| doctype | text                | NO   |     | NULL    |       |
+---------+---------------------+------+-----+---------+-------+
like this:
    my $sql = "INSERT INTO $table (id, name, data, doctype)
               VALUES (?, ?, ?, ?)";
    my $sth = $dbh->prepare($sql);
    # $data_in is the raw PDF file data
    $logger->log("File $name_in has " . length($data_in)
                 . " bytes and hash " . sha256_hex($data_in));
    $sth->execute($id_in, $name_in, $data_in, "application/pdf");
The log entry shows the correct size and hash, identical to what the
file looks like on disk.
    May 28 14:54:17 dev middleware[26817]: File testdoc.pdf has 493392 bytes and hash 403da58f84328365c8bdb646bfa008f31b44f2c391dc5d40eefa6963bc49c991
But after retrieving the blob (either via the app or via CLI), the file
size is much larger (700562), the hash is different, and the file is
corrupt and cannot be opened by any reader.
    MariaDB [pdfdb]> select data from $table where id = 48 into dumpfile "/tmp/48.dump";

    # ls -l /tmp/48.dump
    -rw-rw-rw- 1 mysql mysql 700562 May 28 14:56 /tmp/48.dump
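To compare the dump with the original file outside the application, a small standalone script along these lines could be used. This is only a sketch: the helper name file_fingerprint is made up for illustration, and it relies on nothing but the core Digest::SHA module.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper: return (size in bytes, SHA-256 hex digest) of a file.
# The file is read in binary mode so no encoding layer touches the data.
sub file_fingerprint {
    my ($path) = @_;
    my $sha = Digest::SHA->new(256);
    $sha->addfile($path, 'b');
    my $size = -s $path;
    return ($size, $sha->hexdigest);
}

# Print one line per file given on the command line.
for my $path (@ARGV) {
    my ($size, $hash) = file_fingerprint($path);
    print "$path: $size bytes, sha256 $hash\n";
}
```

Run with the original PDF and /tmp/48.dump as arguments, it prints size and hash for both, so they can be checked against the log entry above.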
I've tried re-encoding the file (with vim) to latin1, which results in
the original file size 493392, but still leaves the PDF corrupted.
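The size increase would be consistent with every byte being treated as a latin1 character and then written out as UTF-8, because each byte >= 0x80 then turns into two bytes. The following is a minimal sketch of that effect in plain Perl. It is an assumption about the mechanism, not a description of what DBD::mysql actually does internally, and the helper names are made up for illustration.

```perl
use strict;
use warnings;

# Assumed mechanism: interpret each byte as a Latin-1 character and
# encode the result as UTF-8. Bytes below 0x80 stay as they are,
# bytes from 0x80 to 0xFF become two bytes each.
sub encode_bytes_as_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;
    utf8::encode($copy);
    return $copy;
}

# Count the bytes that would be doubled by the encoding above.
sub count_high_bytes {
    my ($bytes) = @_;
    my $count = ($bytes =~ tr/\x80-\xFF//);
    return $count;
}

# Demonstration on a 256-byte sample containing every byte value once.
my $sample  = pack 'C*', 0 .. 255;
my $mangled = encode_bytes_as_utf8($sample);
printf "original: %d bytes, high bytes: %d, after encoding: %d bytes\n",
    length($sample), count_high_bytes($sample), length($mangled);

# Under this assumption, the growth seen in the report would equal the
# number of bytes >= 0x80 in the original PDF.
printf "growth in the report: %d bytes\n", 700562 - 493392;
```

If count_high_bytes on the original PDF data returns exactly the difference between the two file sizes, that would support the double-encoding assumption.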
I assume that this is a bug in DBD::mysql.
System: Gentoo Linux ~amd64, kernel 4.10.5-gentoo
Perl 5.24.1
DBD::mysql 4.42.0
Thanks for looking into this.
Markus