CC: | fg_cur [...] ebi.ac.uk |
Subject: | measurement unit overwritten when measurements have same numeric value but different units |
Date: | Thu, 03 Apr 2014 09:04:02 +0100 |
To: | bug-Bio-MAGETAB [...] rt.cpan.org |
From: | Maria Keays <mkeays [...] ebi.ac.uk> |
Dear Tim,
When using Bio::MAGETAB we have noticed that when two measurements have
the same numeric value but different units (e.g. "2 micromolar" and "2
nanomolar"), only one measurement object is created for the numeric
value, and its unit is whichever unit was last assigned to that numeric
value in the SDRF.
E.g. in a "FactorValue[dose]\tFactorValue[unit]" column pair, if there
is a row with "2\tmicromolar" followed by a row with "2\tnanomolar",
only one measurement object with value "2" containing a unit object with
value "nanomolar" is created. Conversely, if "2\tmicromolar" is second,
the unit object has value "micromolar".
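To illustrate the symptom: the behaviour matches what one would see if
measurement objects were looked up by numeric value alone. The following
is a standalone sketch, not Bio::MAGETAB's actual code, and we have not
confirmed that this is the cause. The helper build_measurements and its
key functions are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an object cache; NOT Bio::MAGETAB internals.
# $key_fn decides which fields identify a measurement.
sub build_measurements {
    my ($rows, $key_fn) = @_;
    my %cache;
    for my $row (@$rows) {
        my ($value, $unit) = @$row;
        my $key = $key_fn->($value, $unit);
        # Create the object once per key, then (re)assign its unit,
        # so a later row sharing the key overwrites the earlier unit.
        $cache{$key} ||= { value => $value };
        $cache{$key}{unit} = $unit;
    }
    return \%cache;
}

my @rows = ( [ 2, 'micromolar' ], [ 2, 'nanomolar' ] );

# Keyed on value only: one object, unit taken from the last row.
my $by_value = build_measurements(\@rows, sub { $_[0] });

# Keyed on value and unit: two distinct objects.
my $by_both = build_measurements(\@rows, sub { join "\t", @_ });

printf "value only:     %d object(s), unit of '2' = %s\n",
    scalar(keys %$by_value), $by_value->{2}{unit};
printf "value and unit: %d object(s)\n", scalar(keys %$by_both);
```

With the value-only key the sketch ends up with a single object whose
unit is "nanomolar". With the value-and-unit key both rows are kept,
which is the result we would expect from the parser.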
The problem goes away if we replace "2\tmicromolar" with
"2000\tnanomolar": two measurement objects are then created, so there
is a work-around in this case. I am not sure a work-around like this
would be sensible in all cases, though, for example "2 picomolar" vs.
"2 molar", where the units differ by twelve orders of magnitude.
Please see the attached files for information:
* E-MEXP-3577.nanomolar.sdrf.txt : SDRF with "2\tnanomolar" last.
* E-MEXP-3577.factorValues.nanomolar.txt : factorValue objects from
Data::Dumper using SDRF with "2\tnanomolar" last.
* E-MEXP-3577.micromolar.sdrf.txt : SDRF with "2\tmicromolar" last.
* E-MEXP-3577.factorValues.micromolar.txt : factorValue objects from
Data::Dumper using SDRF with "2\tmicromolar" last.
* E-MEXP-3577.fixed.sdrf.txt : SDRF with "2000\tnanomolar" instead of
"2\tmicromolar".
* E-MEXP-3577.factorValues.fixed.txt : factorValue objects from
Data::Dumper using SDRF with "2000\tnanomolar" instead of "2\tmicromolar".
* E-MEXP-3577.idf.txt : IDF file required to parse MAGE-TAB (copying
relevant SDRF to E-MEXP-3577.sdrf.txt for each test).
The code I used to generate the E-MEXP-3577.factorValues.* files is as
follows:
---------------------
use strict;
use warnings;

use Bio::MAGETAB::Util::Reader;
use Data::Dumper;

my $idf_filename = "E-MEXP-3577.idf.txt";

# Parse the IDF (and the SDRF it references), skipping data files.
my $reader = Bio::MAGETAB::Util::Reader->new({
    idf              => $idf_filename,
    relaxed_parser   => 1,
    ignore_datafiles => 1,
});
my $magetab = $reader->parse;

# Dump every FactorValue object created by the parser.
foreach my $factorValue ($magetab->get_factorValues) {
    print Dumper($factorValue);
}
---------------------
We are using Bio::MAGETAB version 1.27, on Perl 5.8.8, on Linux (CentOS).
Thanks!
Maria Keays
--
Maria Keays
Functional Genomics
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
Tel: +44 (0)1223 494546