IRC log for #bioperl, 2011-03-17

All times shown according to UTC.

Time Nick Message
00:33 bag_ joined #bioperl
00:42 ank left #bioperl
00:52 bag_ left #bioperl
01:37 mzgrideng joined #bioperl
02:22 splut joined #bioperl
02:23 perl_splut left #bioperl
02:47 CIA-57 bioperl-live: Robert Buels master * ref979d2 / Bio/Root/RootI.pm : improved _rearrange performance by 30% - http://bit.ly/hWOAd4
03:19 mzgrideng left #bioperl
03:20 rbuels dukeleto: ^^^^
03:20 rbuels (_rearrange is called many many times by 90% of bioperl in tight loops)
03:20 dukeleto rbuels: ooh, shiny
03:20 dukeleto rbuels: where is your benchmark script?
03:21 dukeleto rbuels: you should commit it as a benchmark test, so we can see if people make it slow again
03:21 rbuels dukeleto: my benchmark was me running nytprof a few times
03:21 rbuels dukeleto: you have a good benchmark-test module to recommend?
03:22 dukeleto rbuels: Tool::Bench
03:22 dukeleto rbuels: but anything will do
03:23 dukeleto rbuels: https://github.com/notbenh/tool_bench
03:23 dukeleto rbuels: not sure if it has hit CPAN yet
03:23 rbuels dukeleto: how could i normalize the benchmark for people's machines in such a way that it would show if somebody made it slower?
03:23 rbuels dukeleto: have a reference implementation?
03:23 rbuels dukeleto: (in the test)?
03:25 dukeleto rbuels: i am just saying to have the script, it doesn't necessarily need to pass or fail
03:25 dukeleto rbuels: but having something where people can say "this took X seconds on this commit but Y seconds on that commit" is valuable
03:26 dukeleto rbuels: perhaps a benchmarks/ directory is needed
03:26 dukeleto rbuels: something akin to https://github.com/parrot/parrot/tree/master/examples/benchmarks
03:28 dukeleto rbuels: anyway, nice work
03:28 dukeleto rbuels: have you thought of writing _rearrange in XS ? ::ducks::
03:32 * rbuels rolls his eyes
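
A benchmark script of the kind dukeleto describes could be sketched with the core Benchmark module, which reports timings without passing or failing. The two routines below are hypothetical stand-ins for an "old" and a "new" version of a hot function; neither is BioPerl's actual _rearrange, and the iteration count is an arbitrary choice.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical "before" version: normalizes each key in an explicit loop.
sub rearrange_old {
    my ($order, %args) = @_;
    my %norm;
    for my $key (keys %args) {
        (my $name = uc $key) =~ s/^-//;    # '-seq' becomes 'SEQ'
        $norm{$name} = $args{$key};
    }
    return map { $norm{ uc $_ } } @$order;
}

# Hypothetical "after" version: assumes every key has a leading dash
# and uses a hash slice instead of a second loop.
sub rearrange_new {
    my ($order, %args) = @_;
    my %norm = map { ( uc(substr $_, 1) => $args{$_} ) } keys %args;
    return @norm{ map { uc($_) } @$order };
}

my @call = ( [qw(SEQ ID)], -seq => 'ACGT', -id => 'x' );

# Prints a rate comparison table; run it on two commits and compare by eye.
cmpthese( 200_000, {
    old => sub { my @r = rearrange_old(@call) },
    new => sub { my @r = rearrange_new(@call) },
} );
```

Absolute numbers depend on the machine, so the comparison is only meaningful when both versions are timed on the same box, which is the "X seconds on this commit but Y seconds on that commit" use described above.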
03:51 zenman_ joined #bioperl
04:19 sheenams joined #bioperl
04:22 CIA-57 bioperl-live: Robert Buels master * r108312d / Bio/SeqFeature/Generic.pm : improved Bio::SeqFeature::Generic::has_tag performance by about 30% - http://bit.ly/gED7QH
04:22 CIA-57 bioperl-live: Robert Buels master * r68a5215 / Bio/Location/Atomic.pm : fix big performance regression in which _load_module was getting called every time(!) a new Bio::Location::Atomic (and subclasses) was created - http://bit.ly/gfsDjq
04:26 rbuels Devel::NYTProf is wonderful.   (as if it needed more praise)
04:27 rbuels sheenams: welcome to the matrix.
04:27 * rbuels chuckles
04:27 sheenams haha....don't make me choose a pill color.
04:28 rbuels sheenams: oh no, by starting your irc client you have already swallowed the red one.
04:28 sheenams oh ok. well at least i didn't have to google which one I wanted. I don't remember the movie that well
04:29 rbuels sheenams: i'm actually just now headed out, but we'll skype tomorrow
04:29 sheenams ok. i'm working on getting skype to function in ubuntu. the mike volume is incredibly low and I can't seem to change it.
04:29 rbuels sheenams: i don't know if pyrimidine will be there or not (pyrimidine == chris )
04:30 sheenams well I'm sure you'll be able to answer a bunch of my questions anyways.
04:30 sheenams do you know dan bolser?
04:30 rbuels sheenams: sure of course.
04:30 rbuels (i know dan)
04:30 rbuels sheenams: he's in this channel now, dbolser
04:30 sheenams i tried to message him about gsoc but he didn't get back to me. have an email address I can send it to?
04:31 rbuels sheenams: did he say something odd to you about "talk to duke"?
04:32 sheenams no. i mentioned that duke told me to talk to him...
04:32 rbuels everything is become clear to me now.
04:32 rbuels dukeleto: ^^^^^
04:34 rbuels sheenams: in more recent ubuntu, you might need to use pavucontrol to adjust skype's volume levels
04:34 rbuels sheenams: or the system volume control
04:34 rbuels sheenams: especially if you are using an external headset
04:35 sheenams i'll work on that. no external headset so I have to rely on the built in mike (i know, i lose nerd points for that)
04:39 * rbuels finds skype really annoying
04:40 sheenams agreed. even with input volume at 150% its really quiet.
04:40 sheenams windows 7 it is for skype (and onenote)
04:42 sheenams it works if i yell at it....
04:42 rbuels sheenams: we could use google voice if you can get that working better
04:42 rbuels sheenams: that has a linux client too
04:42 sheenams thats what i use for most chats
04:48 sheenams google voice/video chat is good to go in ubuntu.
04:52 * rbuels works on it
05:02 rbuels sheenams: ok, i think i have it working on my end also
05:03 * rbuels goes to bed
05:03 sheenams night
05:03 CIA-57 bioperl-live: Robert Buels master * re40cfe6 / (3 files in 3 dirs): add test for leading whitespace before exonerate query, fix to work with old exonerate format (you should run the tests hyphaltip!) - http://bit.ly/eNZSxO
05:53 sheenams left #bioperl
07:18 bag_ joined #bioperl
08:26 bag_ left #bioperl
09:42 dbolser hi sheenams... IRC is like email, except when you log off, I've no frickin idea how to contact you again...
09:44 dbolser anyone got an email for sheenams? I'll lurk while the world turns.
09:45 dbolser rbuels: good work with the performance boosts
10:12 cassj joined #bioperl
12:16 zenman_ left #bioperl
13:07 alxsi joined #bioperl
13:07 rbuels had a nice night last night with Devel::NYTProf
13:29 melic left #bioperl
13:41 philsf joined #bioperl
13:45 pyrimidine rbuels: wondered whether you started profiling.  Devel::NYTProf is nice!
13:46 rbuels pyrimidine: yeah, i had nytprof out cause i was playing with the refactored ace.pm
13:46 rbuels (rewritten, really)
13:46 rbuels so much better
13:46 driley_ is now known as driley
13:47 pyrimidine yeah, there are defin. parts that could use a good profiler, others that just need to be simpler in design
13:47 pyrimidine ***cough***Locations***cough***
13:47 rbuels yeah.
13:48 rbuels pyrimidine: well, you saw that Locations thing i did
13:48 pyrimidine rbuels: yep
13:48 rbuels pyrimidine: that cut 5 sec off of a 33-second runtime
13:48 pyrimidine rbuels: did you see my branch commit?
13:49 * rbuels looks
13:49 pyrimidine https://github.com/bioperl/bioperl-live/tree/topic/cached_locations
13:49 rbuels iiiinteresting
13:50 pyrimidine the problem is, any parsers need to reset the cache
13:50 pyrimidine but there is a decent speedup for large seqs
13:51 rbuels pyrimidine: what parsers use from_str
13:51 dbolser ace++
13:52 dbolser For feature intersections I'm using Set::IntRange
13:52 dbolser quite fast
13:53 pyrimidine rbuels: There is a Bio::Factory::FTLocationFactory in Bio::SeqIO; I think GenBank and EMBL use that for generating locations, and Bio::SeqIO::FTHelper does other bits of gruntwork
13:53 pyrimidine (though FTHelper should just go away, it's extraneous)
13:53 rbuels pyrimidine: seems like the caching should be something that's handled by the parsers that are creating locations
13:54 rbuels (creating them willy-nilly)
13:54 alxsi can someone explain this line to me please? my @cds = grep { $_->primary_tag eq 'CDS' } $seq->get_SeqFeatures;
13:54 pyrimidine rbuels: thought about that. but then each parser is implementing an independent caching system
13:54 rbuels alxsi: do you know perl?
13:54 pyrimidine s/is/would be/
13:54 rbuels pyrimidine: yeah it is, but caching under the covers is a Bad Idea
13:55 rbuels pyrimidine: you could make a LocationCacher i suppose
13:55 rbuels pyrimidine: so they only need to be poked a *little* to add the caching
13:55 pyrimidine rbuels: yeah, that's probably a better idea
13:55 rbuels but caching magically inside the location stuff ...
13:55 rbuels that'll lead to trouble
13:56 pyrimidine rbuels: or, something that wraps the LocationFactory and specifically implements caching
13:56 rbuels that would work too
13:56 pyrimidine rbuels: the caching is settable, and off by default
13:57 rbuels pyrimidine: make sure the cache doesn't start getting huge
13:57 pyrimidine rbuels: though I like the wrapper idea more (decouples the caching behavior from the everything else)
13:58 pyrimidine rbuels: right; the parser would be responsible for resetting the cache
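
The wrapper idea discussed here (a caching layer around the location factory, reset by the parser) might look roughly like the sketch below. Both classes are hypothetical illustrations in plain Perl: My::LocationFactory is a toy stand-in, not Bio::Factory::FTLocationFactory, and the method names from_string and reset_cache are assumptions made for this example.

```perl
use strict;
use warnings;

# Toy factory: parses "start..end" or "complement(start..end)" into a hash.
package My::LocationFactory;
sub new { return bless { parsed => 0 }, shift }
sub from_string {
    my ($self, $str) = @_;
    $self->{parsed}++;                      # count real parses
    my ($start, $end) = $str =~ /(\d+)\.\.(\d+)/
        or die "cannot parse location '$str'";
    my $strand = $str =~ /^complement/ ? -1 : 1;
    return { start => $start, end => $end, strand => $strand };
}

# Wrapper: same interface, but remembers results keyed by the input string.
package My::CachingLocationFactory;
sub new {
    my ($class, $factory) = @_;
    return bless { factory => $factory, cache => {} }, $class;
}
sub from_string {
    my ($self, $str) = @_;
    return $self->{cache}{$str} //= $self->{factory}->from_string($str);
}
# A parser would call this between records so the cache cannot grow unbounded.
sub reset_cache { %{ $_[0]{cache} } = (); return }

package main;
my $inner   = My::LocationFactory->new;
my $factory = My::CachingLocationFactory->new($inner);
$factory->from_string('complement(26161..27204)') for 1 .. 3;
print "real parses: $inner->{parsed}\n";
```

Keeping the cache in a separate object leaves the underlying factory unchanged, so callers opt in explicitly. One caveat of this sketch: cached lookups return the same reference each time, so a caller that mutates a location would affect every later lookup.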
13:58 pyrimidine dbolser: have you tried any of the SQLite RTree stuff?
13:58 dbolser alxsi: rbuels means you should lookup grep
13:58 dbolser pyrimidine: no, is it positional indexing?
13:58 pyrimidine dbolser: spatial indexing
13:59 dbolser oh right
13:59 pyrimidine I just added a patch that made it into CPAN in the latest DBD::SQLite release
13:59 pyrimidine *dev release
13:59 alxsi yes I looked it up, but I don't really understand its use
13:59 pyrimidine http://search.cpan.org/~adamk/DBD-SQLite-1.32_02/
14:00 rbuels alxsi: it filters a list, giving the results as another list
14:00 rbuels alxsi: just like the shell command 'grep' filters a stream
14:01 dbolser alxsi: my @values_equal_to_1 = grep{ $_ == 1 } @values;
14:01 pyrimidine my @filtered_sfs = grep {$_->has_tag('CDS') } $seq->get_SeqFeatures;
14:01 alxsi so this gets all the values from @values that == 1?
14:02 dbolser pyrimidine: using Set::IntRange within feature sets is more general than using SQLite though...
14:02 dbolser alxsi: yup, not very useful, but yeah
14:02 dbolser you get a list of 1s at best...
14:02 dbolser but in the bp example, you get all objects where the tag is CDS
14:03 dbolser primary_tag that is
14:03 dbolser l8r
14:03 alxsi ok yes i got it thanks for clearing that up
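
The grep idiom explained above can be tried without BioPerl installed. In this sketch My::Feature is a hypothetical mock with just a primary_tag method; in real code the objects would come from $seq->get_SeqFeatures. Note that primary_tag is the feature's type, whereas has_tag tests whether the feature carries a tag (qualifier) of the given name, so the two filters are not interchangeable.

```perl
use strict;
use warnings;

# Minimal mock of a feature object, for illustration only.
package My::Feature;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub primary_tag { return $_[0]{primary_tag} }

package main;
my @features = map { My::Feature->new( primary_tag => $_ ) } qw(gene CDS tRNA CDS);

# grep runs the block once per element with $_ set to that element and
# keeps the elements for which the block returns a true value.
my @cds = grep { $_->primary_tag eq 'CDS' } @features;

printf "%d of %d features are CDS\n", scalar @cds, scalar @features;
```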
14:04 pyrimidine dbolser: does it work directly across specifically identified ranges (like chromosomes)?  For instance, would it differentiate 1..10 on chr1 vs 1..10 on chrX?
14:05 pyrimidine I know that you can do a prelim test on that, but DBD::SQLite's RTree can do that directly, with a bit of magic
14:06 pyrimidine dbolser: just curious on that bit; I'm working on a generic non-bioperl backend that abstracts that away, so a pure perl solution would be nice
14:07 pyrimidine (mysql and Pg also implement RTree, though Pg has it in a separate module)
14:10 dbolser pyrimidine: no, Set::IntRange is very crude
14:10 dbolser low level
14:10 dbolser in fact I'm not even sure if it would be efficient on chromosome scale feature sets
14:11 pyrimidine dbolser: that's okay.  Just so you know, there are a few other pure perl-based modules that do the same thing, just not sure of their performance
14:11 dbolser right
14:11 pyrimidine dbolser: apparently, DBD::SQLite is used by a number of devs who work with SNPS, can handle millions of features
14:12 pyrimidine and the RTree implementation is supposed to scale better than binning and other schemes (acc. to Jim Kent's BigWig paper, which uses this)
14:16 mzgrideng joined #bioperl
14:37 alxsi left #bioperl
14:37 andrei_ joined #bioperl
14:38 andrei_ hey guys
14:39 andrei_ I was just wondering if when you extract a CDS sequence from .gbk file, and in the gbk is said that the sequence is complement, does bioperl reverse it when it extracts it?
14:40 andrei_ CDS             complement(26161..27204)
14:41 andrei_ this is how it looks in the genbank file
14:41 pyrimidine andrei_: if it is a feature, and you call '$sf->seq', it returns the sequence based on the location, so yes
14:41 andrei_ ok thanks
14:43 pyrimidine andrei_: just a small gotcha: if the location is split (e.g. feature has multiple locations with a join, like exons) then you should call spliced_seq() instead of $sf->seq
15:01 andrei_ yes I know, but if the location is not split? can I still call spliced_seq()?
15:06 pyrimidine andrei_: yes, it doesn't hurt, but it is a bit slower for obvious reasons
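
What "returns the sequence based on the location" means for a complement(...) feature can be shown in plain Perl. This is a sketch of the concept only, not BioPerl's implementation; extract_feature_seq and the toy sequence are hypothetical, and IUPAC ambiguity codes are ignored.

```perl
use strict;
use warnings;

# Return the subsequence for a 1-based, inclusive range. For a minus-strand
# location, as in complement(start..end), reverse-complement the result.
sub extract_feature_seq {
    my ($seq, $start, $end, $strand) = @_;
    my $subseq = substr $seq, $start - 1, $end - $start + 1;
    if ($strand < 0) {
        $subseq = reverse $subseq;          # scalar context reverses the string
        $subseq =~ tr/ACGTacgt/TGCAtgca/;   # complement each base
    }
    return $subseq;
}

my $genome = 'AAACCCGGGTTT';
print extract_feature_seq($genome, 2, 6,  1), "\n";   # 2..6 on the plus strand
print extract_feature_seq($genome, 2, 6, -1), "\n";   # complement(2..6)
```

A split location such as join(...) would need each sub-range extracted and concatenated in order, which is the extra work spliced_seq() does.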
15:36 CIA-57 bioperl-live: Chris Fields master * r97074c7 / MANIFEST : update manifest - http://bit.ly/fvttkc
15:39 pyrimidine oh, ffs.  The remote database changed back for the Map.t tests, failing again.  grrr....
15:45 pyrimidine rbuels: bioperl seems .... snappier
15:45 pyrimidine :)
15:46 * pyrimidine using the update meme : http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=%22seems+snappier%22
15:50 * rbuels chuckles
16:04 dbolser pyrimidine: thanks for that tip, I may be involved in a big snp project next year
16:09 andrei_ left #bioperl
16:10 philsf left #bioperl
16:11 cassj left #bioperl
16:33 philsf joined #bioperl
17:25 splut left #bioperl
17:25 perl_splut joined #bioperl
17:32 sheenams joined #bioperl
17:46 pyrimidine posts on Redmine and using OpenID for BioPerl wiki: http://news.open-bio.org/news/
17:46 * pyrimidine goes back to his regularly scheduled hacking
17:49 pyrimidine dukeleto: apparently redmine doesn't like the quotes in your name
17:50 pyrimidine dukeleto: the other email notification error was a bug with the bugzilla transition to redmine that I fixed
17:51 pyrimidine dukeleto: fixed the quoting thing as well, so you should be able to login
18:01 rbuels sheenams: is your google voice thing ringing?
18:20 dukeleto pyrimidine: yes, my quotes find lots of bugs in various places :)
18:20 dukeleto pyrimidine: thanks for fixing stuff
18:21 pyrimidine dukeleto: np
18:21 dukeleto pyrimidine: the redmine instance should really use SSL, or else we will tempt the Firesheep gods
18:23 pyrimidine dukeleto: I can check into that with Chris D, not sure what the apache setup is on that EC2 instance
18:24 pyrimidine I think it is configured for SSL though
18:29 pyrimidine ah, no it isn't.  will check with chris d then to see what is going on
18:35 dukeleto pyrimidine: thanks! we don't want anybody to hack our gibsons
18:38 pyrimidine dukeleto: posted to the powers that be.  should get a response soonish
18:46 * rbuels just finished talking to sheenams about the bioperl reorg
18:46 rbuels heh, whatever happens, she'll learn a lot!  i'm giving her a little pre-project to extract a module from the SGN codebase and CPAN it
18:46 dukeleto rbuels: good lord, you are mean
18:47 rbuels depends on what module I pick  :-)
18:47 sheenams dukeleto: don't egg him on! i'm a noob over here...
18:49 sheenams rbuels: just remember, you're the one I'll be asking to help me when i get stuck.
18:49 * rbuels chuckles
18:49 dukeleto sheenams: indeed. he is a glutton for punishment
18:49 rbuels sheenams: you've got pyrimidine too, don't forget
18:49 rbuels don't let that guy off too easy
18:49 * pyrimidine ducks
18:50 rbuels he's going to be benefitting a lot from a reorg
18:50 pyrimidine everyone will benefit from a reorg
18:50 mzgrideng left #bioperl
18:54 sheenams i'll make sure my email questions go to both of you. and maybe duke too for the fun of it.
19:08 dukeleto sheenams: oh boy!
19:11 pyrimidine :)
19:37 pyrimidine left #bioperl
19:37 pyrimidine joined #bioperl
19:55 rbuels pyrimidine: i've got some de novo assemblies of RNA-seq data to get some unigene-like things.
19:55 rbuels pyrimidine: for petunia, which has no genome right now.
19:55 rbuels pyrimidine: was done with MIRA, I only have a .ace file.
19:56 rbuels pyrimidine: need to put this ace in some kind of database, or index it or something.  what would you recommend?
19:56 rbuels pyrimidine: SAM/BAM seems to be for reference-based stuff, which this isn't ...
19:57 pyrimidine I think Florent recently hacked Bio::Assembly to store data in a Bio::DB::SeqFeature::Store
19:57 rbuels pyrimidine: yep i see that
19:57 rbuels pyrimidine: each Bio::Assembly::Contig has a seqfeature::store in-memory
19:57 pyrimidine ick
19:57 rbuels pyrimidine: well, it works OK ...
19:57 rbuels pretty fast too
19:58 pyrimidine sure, but that sort of defeats the purpose of Bio::DB::SeqFeature::Store
19:58 pyrimidine which is capable of storing lots of (uniquely-named) seqs and features
19:58 rbuels pyrimidine: but anyway, i want to make a page like this, but for an assembled bunch of RNA-seq: http://solgenomics.net/search/unigene.pl?unigene_id=SGN-U444444
19:58 rbuels pyrimidine: particularly the "mRNA member sequences" part
19:59 rbuels (collapsed by default)
19:59 rbuels though i would probably only display the coverage part of that graphic by default
19:59 rbuels pyrimidine: so, i would have a detail page like this for each de-novo-assembled transcript sequences
20:00 pyrimidine rbuels: yeah, that sounds about right
20:00 rbuels pyrimidine: so i need to look up each contig's assembly someplace ...  so i'm trying to figure out where to put it
20:00 rbuels not in chado.
20:00 rbuels could index the .ace file ....
20:00 rbuels could maybe put it in a Bio::DB::SeqFeature::Store
20:01 rbuels or could i actually use a BAM for this?
20:01 pyrimidine I think the Store
20:01 pyrimidine BAM would also work, yes
20:01 pyrimidine each contig would be a reference sequence
20:01 pyrimidine that the reads map to
20:02 rbuels i'm thinking about integrating LookSeq at some point
20:02 rbuels seen it?
20:02 pyrimidine not yet
20:02 rbuels s'pretty nice
20:02 rbuels pyrimidine: http://www.sanger.ac.uk/cgi-bin/teams/team112/lookseq/index.pl
20:02 rbuels lot better than gbrowse for looking at an assembly
20:03 rbuels pyrimidine: http://www.sanger.ac.uk/resources/software/lookseq/
20:03 rbuels (and also http://sourceforge.net/projects/lookseq/)
20:04 pyrimidine it's pretty nice!
20:05 rbuels sure is.  i was kind of shocked to find it today
20:05 rbuels it runs off of some kind of custom sqlite db
20:06 pyrimidine yeah, that's what I'm seeing as well
20:06 rbuels and of course they couldn't be bothered to ping anybody else about it, like GMOD or OBF, apparently
20:06 pyrimidine heh
20:06 pyrimidine not unusual, though
20:07 pyrimidine lots of reworked wheels, even within Sanger
20:08 rbuels well, i'm not aware of any web-based assembly viewer that's this sophisticated
20:08 rbuels are you?
20:08 pyrimidine not at this leve;
20:08 pyrimidine *level
20:09 pyrimidine good to see that it's still developed, bad to see that it has a bus factor = 1
20:09 rbuels hehe
20:10 pyrimidine sorry, 2
20:13 pyrimidine they are replicating some of the samtools efforts with bcftools
20:13 pyrimidine at least that
20:13 pyrimidine 's how it looks
20:13 pyrimidine http://samtools.sourceforge.net/mpileup.shtml
20:15 rbuels even more amazing, from the samtools man page: "Heng  Li from the Sanger Institute wrote the C version of samtools."
20:15 rbuels lol
20:15 rbuels maybe they don't actually overlap ...
20:15 * rbuels doesn't know much about this
20:15 pyrimidine dunno
20:16 pyrimidine though it does store variant call information, which is what bcftools is geared towards
20:21 CIA-57 bioperl-live: Chris Fields master * r54fb1e7 / t/Map/Map.t :
20:21 CIA-57 bioperl-live: Revert "volatile data broke test". Remote database changed data back.
20:21 CIA-57 bioperl-live: This reverts commit 4b69c5e4333990821def4a3a5db2b8645da71b6c. - http://bit.ly/hozENl
20:21 rbuels o_O
20:21 pyrimidine rbuels: Ensembl issues
20:26 CIA-57 bioperl-live: Chris Fields release-1-6-2 * ra037bef / : Merge branch 'release-1-6-2' of github.com:bioperl/bioperl-live into release-1-6-2 (+9 more commits...) - http://bit.ly/dTyGKb
20:28 sl33v3_ joined #bioperl
20:29 melic joined #bioperl
20:30 sl33v3 left #bioperl
20:30 sl33v3_ is now known as sl33v3
20:35 pyrimidine left #bioperl
21:02 * rbuels is getting pretty close to the point of writing Bio::Index::Ace
21:17 sheenams left #bioperl
21:47 philsf left #bioperl
22:53 perl_splut left #bioperl
23:25 deafferret Ventura Pet Detective
23:29 sheenams joined #bioperl
23:50 dbolser rbuels: BAM is quite nippy, And is the emerging standard for short read submission at SRA
23:51 dbolser night
23:52 dbolser (I'm off skiing tomorrow, so if I don't see you guys AgAin, good luck with this whole 'bio-perl' thing)