IRC log for #bioperl, 2009-04-21

All times shown according to UTC.

Time Nick Message
04:49 rbuels_ joined #bioperl
11:30 elTigre joined #bioperl
15:42 rbuels joined #bioperl
15:48 * deafferret waves
16:57 ende joined #bioperl
17:00 * ende waves
17:01 * deafferret tips his hat
17:11 ende so once a chan is registered to someone, I think it's impossible for it to expire
17:11 ende that sucks, I really wanted #diyg
17:11 ende for http://sourceforge.net/projects/diyg/
17:18 deafferret you can petition the ops channel, but they didn't help me get my Nick back from NickServ, even after 18 months -shrug-
17:18 ende they told me basically the same thing
17:18 ende btw anyone here a frequent gmod user?
17:18 ende seems like gmod could really use an irc presence
17:18 deafferret rbuels is I believe
17:19 rbuels yeah i work at sgn.cornell.edu
17:21 rbuels interesting, i didn't know about diyg
17:22 ende yeah, I think some folks in the gmod community might be interested in it
17:22 ende they just seem like they'd fit well together
17:22 rbuels ever been to any of the gmod meetings?
17:23 ende I haven't.  I was just asking brian osborne if he thought that might be a useful way to make inroads
17:24 rbuels oh you work with brian?
17:24 rbuels (or you were asking electronically)
17:24 rbuels ?
17:24 ende well, occasionally we're in the same office, but yeah electronically
17:25 ende we're the primary developers for diya (a microbial genome annotation pipeline)
17:25 rbuels what makes it microbial-specific?
17:27 ende primarily the particular selection of annotation tools that it was originally designed to use
17:27 rbuels hmmm
17:27 ende (although it's incredibly modular at this point, and most of those tools could be theoretically swapped out for more eukaryotic-specific ones)
17:28 rbuels where do you work, what was this developed as part of, etc,
17:28 ende navy medical research center
17:29 ende high throughput pathogen genomics sequencing lab
17:32 rbuels hmmm
17:33 rbuels what's the mission of the high throughput pathogen genomics sequencing lab
17:33 rbuels quick characterization of pathogens in the event of biowarfare?
17:36 ende Something like that.
17:37 rbuels how does this relate to ergatis?
17:38 rbuels http://gmod.org/wiki/TIGR-Workflow_/_Ergatis
17:39 * deafferret has been to one GMOD meeting a couple years back at CSHL
17:40 perl_splut http://www.cio.com/article/490044/SGI_Asset_Sale_Completes_Unraveling_of_Tech_Icon?source=nlt_cioinsider
17:42 rbuels well sun is headed downward as well
17:42 rbuels it's the inevitable commoditization of any maturing industry
17:43 rbuels and Sun and SGI have never been very well positioned to ride that out
17:43 rbuels IBM did it by changing mostly to be a consulting business
17:43 rbuels Sun at least has Java
17:43 rbuels SGI never did anything to dodge that inevitable bullet
17:49 ende rbuels: what's your involvement with gmod?
17:49 rbuels to this point, not much
17:49 rbuels mouthing off mostly
17:50 rbuels yeah, i guess that would characterize it
17:51 rbuels i mostly mouth off about integration
17:51 rbuels bioinformatics is so fragmented, because it's mostly academic
17:51 rbuels and academics don't really work together
17:52 rbuels these are of course gross generalizations
17:52 rbuels but i think that they hold true in a broad sense
17:53 ende I generally agree
17:54 deafferret It seems to me the most widely published seem to be the least cooperation minded.
17:54 ende I think part of the perceived necessity in 'pipelining' stems from the nature of bioinformatics software in general
17:54 deafferret those that cooperate a lot tend to publish a lot less
17:54 rbuels yes that is quite true jay
17:55 rbuels mostly because cooperative projects usually make glacial progress
17:55 rbuels because there is very little accountability in academia
17:55 rbuels my other pet peeve
17:55 rbuels ende: please elaborate
17:56 ende um.. inconsistent/incompatible I/O formats?
17:56 ende so the pipelining comes in..  glueing rigid bits together that do not fit naturally.
17:56 ende that's all I mean
17:57 perl_splut heheh
17:57 perl_splut or in my lab's case where they just store the data in random forms in excel
17:57 rbuels yep, the i/o format thing is a specific consequence of the fragmented nature of bioinformatics
17:58 rbuels and it stays fragmented because cooperation does not usually yield much progress
17:58 rbuels biologists sure love excel.
17:58 perl_splut not sure about that. We do a lot of collaborative work here that seems to be making progress
17:59 perl_splut they might, but most don't seem to know anything about how to make it work for them
17:59 perl_splut 70 people in this lab, and only 3 of us know even a basic thing like click-drag the corner to get excel to autofill/increment in cells
18:00 ende I think projects just require time to mature
18:00 ende look at bioperl
18:00 ende I'd generally consider that a success
18:00 rbuels ....generally, yes
18:01 perl_splut yet bioC and biopython aren't nearly as big in terms of modules :)
18:01 ende one of the ideas behind DIYG in general is simplifying deployment by making use of various technologies
18:01 ende like virtualization
18:01 ende creating a 'genomics analysis pipeline' Virtual Appliance
18:01 rbuels biologist downloads vmware player, virtual appliance, runs analyses.
18:02 rbuels is that the vision?
18:02 ende yes
18:02 ende greatly reduces the amount of installation, version matching, database building, etc
18:03 perl_splut yep
18:03 rbuels what lab are you in perl_splut
18:04 perl_splut biodesign institute
18:04 rbuels and what are examples of the collaborations you guys do that make good progress
18:04 rbuels (got a url for biodesign institute?)
18:05 ende by the way, if any of you are interested in diya (the annotation pipeline) or diyg in general as a project consortium, we're very welcoming ;)
18:05 perl_splut biodesign.asu.edu
18:05 deafferret perl_splut: add yourself!  http://bioperl.org/wiki/IRC  :)
18:07 perl_splut the two biggest are for microarray printing (techniques and technologies) and vaccine technologies with other labs here
18:07 rbuels deafferret: did you set up the bioperl wiki?
18:08 * deafferret blames ende
18:09 ende what'd I do?
18:09 rbuels set up the bioperl wiki i suppose
18:10 perl_splut nah. I prefer some anonymity :)
18:11 rbuels prefer to hang separately?
18:25 perl_splut something like that
18:43 perl_splut trying to recode genes so that they all contain unique sequences. Wonder if there is a faster way than regexp and perl code. The regexp is fast, but think the slowdown is in the actual recoding section
18:43 deafferret what does recode a gene mean?
18:44 perl_splut swap out codons for other codons that also code for that AA
18:44 deafferret ah, k
18:45 deafferret if you're dealing with many genomes at once I would think the "is this unique now?" part would be the slowest
18:45 perl_splut not genomes, just genes
18:45 deafferret "unique" across how many genes?
18:46 perl_splut however many fit in 25kb
18:46 perl_splut which results in about 8k regexp
18:47 deafferret aren't many single genes > 25kb?
18:47 perl_splut nope
18:47 perl_splut most of ours are in the 1.6kb range
18:48 perl_splut it takes about 8-10 hours to recode 7 genes using my current methodology
18:48 deafferret don't know how to help w/o seeing your source. github.com it or something?
18:48 perl_splut since I do a walking window to be sure of uniqueness
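A minimal sketch of the walking-window uniqueness check described above, assuming the goal is to find every fixed-width stretch of sequence that appears more than once across a set of genes. It is written in Python rather than the Perl used in the channel, and it is not perl_splut's code. The function name `repeated_windows`, the window width `k`, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def repeated_windows(genes, k):
    """Return every k-base window that occurs more than once.

    `genes` maps a gene name to its sequence; the function name and the
    choice of k are assumptions made for this sketch.
    """
    seen = defaultdict(list)  # window -> list of (gene name, offset)
    for name, seq in genes.items():
        seq = seq.upper()
        # Slide a window of width k one base at a time along the gene.
        for i in range(len(seq) - k + 1):
            seen[seq[i:i + k]].append((name, i))
    # Keep only windows seen at more than one position.
    return {w: hits for w, hits in seen.items() if len(hits) > 1}


# Toy input, invented for illustration.
genes = {"geneA": "ATGGCTGCAAAGTAA", "geneB": "ATGCCTGCAAAGTGA"}
for window, hits in sorted(repeated_windows(genes, 8).items()):
    print(window, hits)
```

Each window costs one hash lookup here, so a full pass is roughly linear in the total sequence length. That may be cheaper than building one regexp per window, though only profiling on real data would confirm it.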
18:52 perl_splut here's the two main functions that do the recoding
18:52 perl_splut http://rafb.net/p/Z0R9z273.html
18:55 rbuels might want to profile it on a test set
18:55 rbuels to see where it's spending most of its time
18:56 rbuels the olde-timey perl profiling tools should be fine
18:56 rbuels http://www.perl.com/pub/a/2004/06/25/profiling.html
18:56 rbuels i don't really see any recoding in the code here though
18:57 rbuels this looks like finding and removing restriction sites
18:58 perl_splut I use the same code for the various steps of recoding
18:59 perl_splut http://rafb.net/p/dW0PKd93.html
18:59 perl_splut what differs between a restriction site and any other kind of site is the pattern. There really isn't any other difference, heheh
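A hedged sketch of pattern-driven recoding as discussed above: find a site by pattern, then swap one overlapping codon for a synonymous one so the protein is unchanged. It is written in Python, not Perl, and it is a simple greedy approach chosen for illustration, not the code behind the pasted links. It assumes the standard genetic code, an in-frame coding sequence whose length is a multiple of three, and a non-empty pattern. The names `translate`, `count_sites`, and `remove_sites` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> all codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(cds):
    """Translate an in-frame coding sequence to one-letter amino acids."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def count_sites(site, seq):
    """Count non-overlapping matches of the compiled pattern."""
    return len(site.findall(seq))


def remove_sites(cds, pattern):
    """Greedily swap synonymous codons until `pattern` no longer matches."""
    site = re.compile(pattern)
    cds = cds.upper()
    while True:
        match = site.search(cds)
        if match is None:
            return cds
        before = count_sites(site, cds)
        # Try each codon that overlaps the matched site.
        for idx in range(match.start() // 3, (match.end() - 1) // 3 + 1):
            start = idx * 3
            codon = cds[start:start + 3]
            trials = (cds[:start] + alt + cds[start + 3:]
                      for alt in SYNONYMS[CODON_TO_AA[codon]] if alt != codon)
            # Accept the first swap that strictly lowers the site count,
            # which guarantees the outer loop terminates.
            better = next((t for t in trials if count_sites(site, t) < before), None)
            if better is not None:
                cds = better
                break
        else:
            raise ValueError("no synonymous swap removes the site at %d" % match.start())


# Toy example: an EcoRI site (GAATTC) inside a four-codon sequence.
original = "ATGGAATTCTAA"
recoded = remove_sites(original, "GAATTC")
print(recoded, translate(recoded) == translate(original))
```

Because only the pattern changes between steps, the same function could in principle be reused for restriction sites, repeats, or any other motif, which matches the point made in the channel.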
19:01 deafferret Devel::NYTProf for the win
19:03 perl_splut thanks
19:45 deafferret seriously. it's super-sexy if you're not familiar with it.  :)
20:11 perl_splut whee... 36 packages to update
20:13 deafferret :)
21:04 driley joined #bioperl
21:40 perl_splut wow... that really slows down the code, heheh
21:40 deafferret perl_splut: sure, but the ratios are valid  :)
21:40 perl_splut good thing I'm only trying this with one 4kb gene
22:05 perl_splut seems that devel::nytprof has some idiotically hardcoded paths or something as it is trying to load lib/Posix.pm, but isn't finding it as it is doing ../../lib/Posix.pm... heheh
22:10 deafferret perl_splut: I hit a bug last year that was fixed by using SVN version instead of CPAN  http://code.google.com/p/perl-devel-nytprof/
22:11 deafferret no clue if that would help you or not in your circumstance
22:11 perl_splut version I have is 2.09 from Mar 29 '09
22:54 perl_splut hmm.. not sure how to read the results. the code spent far more time doing stuff than nyt seems to be saying...
22:55 perl_splut or is that time per call...
22:55 deafferret perl_splut: I don't remember the details, but I remember that slapped me in the face with several of my inefficiencies last year
22:57 perl_splut Profile of <code> for 484s, executing 34034430 statements and 4924431 subroutine calls in 155 source files and 52 string evals
22:58 perl_splut 14million statements were in the Bio::PrimarySeq module, heheh
22:58 deafferret that's a lot.  :)
22:59 perl_splut yeah, but the numbers don't seem to make sense to me
22:59 perl_splut so, not sure I'm seeing where my inefficiencies might be coming from
22:59 deafferret I remember bouncing back and forth between "oh wow, that's cool!" to "what the fuck is that?"
23:00 deafferret well, you could start at the top of your % time elapsed and see if you can speed up the heavy hitters somehow
23:02 perl_splut don't see that in the html output...
23:02 deafferret haven't used it in a while. if you want to tarball your results and upload them somewhere I could take a look
23:03 * deafferret kicks $vendor_pos repeatedly
23:05 perl_splut not sure I can as this reveals my whole code
23:05 deafferret shucks. my evil plan failed  :)
23:09 perl_splut http://innovationsinmedicine.org/nytdevel/
23:09 perl_splut that's just the index page and css
23:28 perl_splut hmm, looks like it spent 113s calling translate_as_string from the Bio::Perl module...
23:30 ende sourceforge is sweet
23:30 ende their hosted apps selection
23:30 ende not only Trac, which I can't live without, they even host wordpress now
23:30 ende development blogging, imagine that
23:30 ende http://apps.sourceforge.net/wordpress/diyg/