
This queue is for tickets about the DBD-mysql CPAN distribution.

Report information
The Basics
Id: 121921
Status: resolved
Priority: 0
Queue: DBD-mysql

People
Owner: Nobody in particular
Requestors: markus [...] wernig.net
Cc: pali [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: DBD-mysql-4.42.0 encodes binary blobs when storing
Date: Sun, 28 May 2017 15:19:20 +0200
To: bug-DBD-mysql [...] rt.cpan.org
From: Markus Wernig <markus [...] wernig.net>
Hi all

I have an application that, among others, has to store PDF files in a
mysql db. This has been running for almost 10 years now.

After upgrading DBD-mysql to 4.42.0, the PDF files get corrupted when
storing them to the db. It appears that they are somehow encoded in a
character set (presumably utf8), even though the column definition is
"mediumblob".

4.41.0 and earlier versions do not show that behaviour, a downgrade of
DBD::mysql without any other changes restores the correct behaviour.

Here is how the db is connected:

    my $dbh = DBI->connect(${dsn},
                           ${username},
                           ${passwd},
                           { RaiseError => 1,
                             AutoCommit => 1,
                             AutoInactiveDestroy => 1,
                             mysql_auto_reconnect => 1,
                             mysql_enable_utf8 => 1,
                           }
                          )
        or die("DB connect failed: $DBI::errstr");

The code then goes on to insert data into tables like this one:

    +------------+---------------------+------+-----+---------+-------+
    | Field      | Type                | Null | Key | Default | Extra |
    +------------+---------------------+------+-----+---------+-------+
    | id         | bigint(20) unsigned | NO   | PRI | NULL    |       |
    | name       | text                | NO   |     | NULL    |       |
    | data       | mediumblob          | NO   |     | NULL    |       |
    | doctype    | text                | NO   |     | NULL    |       |
    +------------+---------------------+------+-----+---------+-------+

like this:

    my $sql = "INSERT INTO $table (id, name, data, doctype)
               VALUES (?, ?, ?, ?)";
    my $sth = $dbh->prepare($sql);

    # $data_in is the raw PDF file data
    $logger->log("File $name_in has " . length($data_in) .
                 " bytes and hash " . sha256_hex($data_in));

    $sth->execute($id_in, $name_in, $data_in, "application/pdf");

The log entry shows the correct size and hash, identical to what the
file looks like on disk.

    May 28 14:54:17 dev middleware[26817]: File testdoc.pdf has 493392 bytes
    and hash 403da58f84328365c8bdb646bfa008f31b44f2c391dc5d40eefa6963bc49c991

But after retrieving the blob (either via the app or via CLI), the file
size is much larger (700562), the hash is different, and the file is
corrupt and cannot be opened by any reader.

    MariaDB [pdfdb]> select data from $table where id = 48 into dumpfile
    "/tmp/48.dump";

    # ls -l /tmp/48.dump
    -rw-rw-rw- 1 mysql mysql 700562 May 28 14:56 /tmp/48.dump

I've tried re-encoding the file (with vim) to latin1, which results in
the original file size 493392, but still leaves the PDF corrupted.

I assume that this is a bug in DBD::mysql.

System: Gentoo Linux ~amd64, kernel 4.10.5-gentoo
Perl 5.24.1
DBD::mysql 4.42.0

Thanks for looking into this.

Markus
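
The size change reported above (493392 to 700562 bytes, a growth of 207170) is what one would expect if every byte of the blob were treated as a Latin-1 character and then written out as UTF-8, since each byte at or above 0x80 becomes two bytes. The following is only a minimal sketch of that arithmetic using the core Encode module. It does not touch a database or DBD::mysql, and the 8-byte sample string is a made-up stand-in, not data from this ticket.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical stand-in for binary file data: "%PDF" plus four bytes >= 0x80.
my $raw = pack('C*', 0x25, 0x50, 0x44, 0x46, 0xE2, 0xE3, 0xCF, 0xD3);

# Interpret each byte as a Latin-1 character, then serialize as UTF-8.
# This mimics the suspected corruption; it is not DBD::mysql's actual code.
my $mangled = encode('UTF-8', decode('ISO-8859-1', $raw));

# Count the bytes that will double in size (values 0x80..0xFF).
my $high = () = $raw =~ /[\x80-\xFF]/g;

printf "raw: %d bytes, high bytes: %d, mangled: %d bytes\n",
    length($raw), $high, length($mangled);

# Reversing the transformation byte-for-byte recovers the original data.
my $restored = encode('ISO-8859-1', decode('UTF-8', $mangled));
print $restored eq $raw ? "round trip ok\n" : "round trip failed\n";
```

Under this assumption the mangled length always equals the raw length plus the number of high bytes, so a growth of 207170 bytes would correspond to 207170 bytes at or above 0x80 in the original file.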

On Sun May 28 09:22:04 2017, markus@wernig.net wrote:
> After upgrading DBD-mysql to 4.42.0, the PDF files get corrupted when
> storing them to the db. It appears that they are somehow encoded in a
> character set (presumably utf8), even though the column definition is
> "mediumblob".
> [...]

Duplicate of:
https://rt.cpan.org/Public/Bug/Display.html?id=120953
https://github.com/perl5-dbi/DBD-mysql/issues/107

See also for more details:
https://rt.cpan.org/Ticket/Display.html?id=25590
https://rt.cpan.org/Ticket/Display.html?id=60987
https://rt.cpan.org/Ticket/Display.html?id=53130
https://rt.cpan.org/Ticket/Display.html?id=87428