
IRC log for #bioperl, 2010-11-15


All times shown according to UTC.

Time Nick Message
06:48 bag__ joined #bioperl
08:04 bag__ left #bioperl
16:10 brandi joined #bioperl
16:10 brandi left #bioperl
16:45 dbolser hai
16:45 dbolser what is the name of that perl benchmark module
16:46 dbolser Benchmark
16:46 dbolser wiwi
16:46 dbolser wow
16:51 brandi1 joined #bioperl
16:51 brandi1 left #bioperl
16:52 dbolser hmm...
16:52 dbolser want to time a specific piece of code in an overall loop...
17:18 rbuels dbolser: http://search.cpan.org/~jesse/perl-5.12.2/lib/Benchmark.pm
17:18 rbuels dbolser: or my $start = time;  my $elapsed = time - $start;  ;-)
17:19 pyrimidine joined #bioperl
17:19 rbuels dbolser: and Time::HiRes if you want milliseconds
17:26 dbolser rbuels: nope
17:26 dbolser fail!
17:26 dbolser Benchmark::Timer ++
17:26 dbolser :-)
17:26 dbolser 10 trials of yup (9.731s total), 973.135ms/trial
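
The idea discussed above, timing one specific piece of code inside a larger loop, can be sketched as follows. This is a minimal Python illustration, not the API of Benchmark::Timer or Benchmark; the class name SectionTimer and its methods are hypothetical, and the report string only imitates the format pasted in the log.

```python
import time


class SectionTimer:
    """Accumulate wall-clock time for one labelled section across loop iterations.

    Hypothetical helper for illustration; it is not a port of any Perl module.
    """

    def __init__(self, label):
        self.label = label
        self.trials = 0
        self.total = 0.0
        self._start = None

    def start(self):
        # perf_counter is a high-resolution clock intended for measuring intervals
        self._start = time.perf_counter()

    def stop(self):
        self.total += time.perf_counter() - self._start
        self.trials += 1
        self._start = None

    def report(self):
        per_trial_ms = 1000.0 * self.total / self.trials if self.trials else 0.0
        return (f"{self.trials} trials of {self.label} "
                f"({self.total:.3f}s total), {per_trial_ms:.3f}ms/trial")


timer = SectionTimer("count reads")
for _ in range(3):
    # anything outside start()/stop() is not counted
    timer.start()
    sum(range(100_000))  # stand-in for the call being measured
    timer.stop()
print(timer.report())
```

Only the code between start() and stop() contributes to the total, so per-iteration setup and teardown stay out of the measurement.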
17:27 dbolser trying to prove that the BAM api can return the total number of reads within a reasonable time
17:28 pyrimidine dbolser: I just run a callout to samtools for that
17:28 dbolser pyrimidine: isn't that what use Bio::DB::Sam is?
17:28 dbolser 100 trials of count reads (62.974s total), 629.740ms/trial
17:29 pyrimidine dbolser: nope, those are the bindings to libbam.a
17:29 dbolser pyrimidine: got benchmark data?
17:29 dbolser http://bioperl.pastebin.com/hjCMp8bk
17:29 dbolser ;-)
17:30 dbolser Not sure if using the indexed interface will give me a boost here
17:35 pyrimidine dbolser: problem is, libbam.a doesn't include the BAM file stats code, so there is no way for Bio::DB::Sam to ask for that
17:36 dbolser pyrimidine: ahhh...
17:36 dbolser know if the java binding does?
17:36 pyrimidine here's the sub I use: http://bioperl.pastebin.com/HJxfwZ1G
17:36 dbolser ty
17:37 dbolser pyrimidine: thing is, I need this 'per contig'
17:37 dbolser or, 'per target sequence'
17:37 pyrimidine there is a quicker way, using 'samtools idxstats' I think
17:37 pyrimidine lemme try...
17:40 pyrimidine https://gist.github.com/700660
17:41 pyrimidine dbolser: ^^^^
17:41 dbolser pyrimidine: nice
17:41 pyrimidine dbolser: that's very fast, simple tab-based, easy to parse
17:41 dbolser yup
17:41 dbolser what are cols
17:42 dbolser does it look for the index and build one if not?
17:42 pyrimidine dbolser: ref seq name, seq length, # mapped reads, # unmapped reads
17:42 pyrimidine I assume it looks for the .bai file, yes
17:42 dbolser I think my index is broken :(
17:42 dbolser PGSC0003DMB000000001    7100477 0       0
17:42 dbolser PGSC0003DMB000000002    6562806 0       0
17:42 dbolser PGSC0003DMB000000003    4595448 0       0
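
Given the four columns described above (reference name, sequence length, mapped reads, unmapped reads), per-contig counts can be pulled out of 'samtools idxstats' output with a few lines of parsing. This is a rough Python sketch that works on captured text rather than invoking samtools; the names RefStats, parse_idxstats and all_counts_zero are made up for illustration, and the sample rows are the three lines pasted in the log.

```python
from typing import NamedTuple


class RefStats(NamedTuple):
    name: str
    length: int
    mapped: int
    unmapped: int


def parse_idxstats(text):
    """Parse whitespace-separated idxstats rows into RefStats records."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        name, length, mapped, unmapped = fields
        rows.append(RefStats(name, int(length), int(mapped), int(unmapped)))
    return rows


def all_counts_zero(rows):
    """True when every reference reports zero mapped reads (worth a second look)."""
    return bool(rows) and all(r.mapped == 0 for r in rows)


sample = (
    "PGSC0003DMB000000001\t7100477\t0\t0\n"
    "PGSC0003DMB000000002\t6562806\t0\t0\n"
    "PGSC0003DMB000000003\t4595448\t0\t0\n"
)
rows = parse_idxstats(sample)
for r in rows:
    print(r.name, r.mapped)
print("suspicious:", all_counts_zero(rows))
```

A result where every reference shows zero mapped reads, as here, suggests a problem with the index or the sort order rather than a genuinely empty alignment.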
17:43 pyrimidine dbolser: could always run samtools index on it
17:43 dbolser the perl version shows me otherwise
17:43 pyrimidine ah, but how long does it take?
17:44 dbolser I have an index for that file
17:44 dbolser file = some.bam
17:44 dbolser index = some.bam.bai
17:45 pyrimidine dbolser: is the BAM sorted?
17:45 dbolser I thought it was, but let's see
17:45 dbolser file is called .sorted.bam...
17:46 dbolser zcat ../Results/bowtie_combined.sorted.bam | head # BAM [binary bytes] @HD      VN:1.0  SO:unsorted
17:46 dbolser !
17:46 * dbolser checks his pipeline...
17:46 dbolser thanks for tip in any case
17:47 pyrimidine Most problems I had with SAM/BAM were generally b/c the file was not sorted as the last step
17:48 pyrimidine (particularly with Bio::DB::Sam, but that's interfacing with samtools code, so maybe it's related)
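
The 'zcat | head' check above works because a BAM file is gzip-compatible and starts with a plain-text header. A slightly more careful version of the same check is sketched below in Python: it reads the magic bytes and the header text length, then looks for the SO: tag on the @HD line. The function name bam_sort_order is hypothetical, and the demo builds a tiny fake header in memory instead of touching a real file. The SO: tag only records what the producing tool declared, so this reports the declared sort order and does not verify the actual read order.

```python
import gzip
import io
import struct


def bam_sort_order(stream):
    """Return the SO: value from the @HD header line, or None if it is absent."""
    with gzip.GzipFile(fileobj=stream, mode="rb") as bam:
        if bam.read(4) != b"BAM\x01":
            raise ValueError("not a BAM stream")
        # l_text: length of the plain-text header, little-endian 32-bit integer
        (l_text,) = struct.unpack("<i", bam.read(4))
        text = bam.read(l_text).rstrip(b"\x00").decode("ascii", "replace")
    for line in text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Build a minimal fake BAM header for the demo (no references, no reads).
header_text = b"@HD\tVN:1.0\tSO:unsorted\n"
payload = (b"BAM\x01" + struct.pack("<i", len(header_text)) + header_text
           + struct.pack("<i", 0))
fake_bam = gzip.compress(payload)

print(bam_sort_order(io.BytesIO(fake_bam)))  # prints: unsorted
```

For a real file, the same function could be called on a handle opened in binary mode, for example open(path, "rb").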
17:49 dbolser From my notes:
17:49 dbolser # Convert the SAM file to BAM in preparation for sorting
17:49 dbolser # Next, we sort the BAM file, in preparation for SNP calling:
17:49 dbolser silly pipeline!
17:55 pyrimidine dbolser: any reason you are using a custom pipeline for SNP calling?
17:56 dbolser samtools pileup -cv -f ?
17:56 dbolser is that custom?
17:56 pyrimidine ah, I thought you were using Bio::DB::Sam for that, my bad
17:57 dbolser no, just trying to prove a point to the Tablet developers
17:57 dbolser Tablet reads BAM, but refuses to show overall stats because they're "time consuming to calculate"
17:58 pyrimidine dbolser: something to bring up with the samtools devs, who could include that code with the samtools library but haven't added it in (yet)
17:59 dbolser pyrimidine: good point
17:59 dbolser sorting ain't speedy!
18:01 pyrimidine nope, it isn't
18:01 pyrimidine but it's needed to make access/stats speedy
18:07 dnewkirk left #bioperl
18:17 dbolser right
18:17 dbolser that solves the idxstats problem
18:17 dbolser the file is about the same size as my nominally 'sorted' bam, but it's totally different
18:17 dbolser should prolly recall snps...
18:18 dbolser redo analysis
18:18 dbolser rewrite paper...
18:20 pyrimidine :P
18:20 dbolser and I was just about to go home!
18:20 dbolser ;-)
18:35 dbolser seems I'm calling a subset of the snps I called previously
18:40 pyrimidine dbolser: any idea why it's only a subset?
21:22 sl33v3 left #bioperl
21:23 sl33v3 joined #bioperl
22:46 bag__ joined #bioperl
23:29 bag__ left #bioperl
