IRC log for #bioperl, 2010-05-21

All times shown according to UTC.

Time Nick Message
00:35 kyanardag_ joined #bioperl
02:13 CIA-60 bioperl-live: Robert Buels topic/longest_orf * ra60b745 / (Bio/PrimarySeqI.pm t/Seq/PrimarySeq.t): refactored ORF finding in Bio::PrimarySeqI, in order to add capability of finding a sequence's longest ORF. tests not quite passing - http://bit.ly/d8YLgn
02:28 deafferret not I, sorry
02:34 deafferret hmm... /me tries rbuels' fancy gitk toy
02:35 deafferret in an attempt to understand this fangly situation
02:56 deafferret GitX++
03:01 deafferret oh, ha! the r in front of git revisions should be removed from the IRC bot  :)   re71c10c looks awfully like it starts with r  :)
03:06 deafferret git is fun. it makes me feel really stupid constantly   :)
03:10 deafferret my branch Jul 2009 -> this week looks pretty cool in GitX. Then especially how I merged it twice at the end
03:10 deafferret cluelessness++
03:32 CIA-60 bioperl-live: Jay Hannah master * r3198b27 / Changes : Simplifying further since there was no response to my BUGS question on the mailing list. - http://bit.ly/dtN2t5
04:21 cawiss joined #bioperl
04:36 kyanardag_ joined #bioperl
06:41 CIA-95 joined #bioperl
06:53 vinnana joined #bioperl
09:01 kd` joined #bioperl
09:01 kd` hello
09:02 kd` could someone announce the perl survey to the bioperl mailing list please.  The official announcement is here:  http://news.perlfoundation.org/
09:02 kd` oops, here: http://news.perlfoundation.org/2010/05/grant-update-the-perl-survey-1.html
12:14 kblin joined #bioperl
12:14 kblin hi folks
12:18 kblin I'm trying to parse HMMer3 hmmscan results using Bio::SearchIO and the parser reads in the data without throwing an error, but it fails to find any classification hits.
12:47 rdesfo joined #bioperl
13:02 rdesfo left #bioperl
14:09 cawiss joined #bioperl
14:38 vinnana left #bioperl
15:12 kyanardag_ joined #bioperl
15:40 spekki01 anyone that's used hmmer3 to search a sequence db know how to make it give some output to the terminal so you can tell it's actually searching?
15:53 cawiss joined #bioperl
16:05 bag_ joined #bioperl
17:26 deafferret spekki01: so currently you have no output at all, but 100% CPU use?
17:27 rbuels spekki01: well you can tail its output files
17:27 rbuels spekki01: tail -f <output file>
17:27 rbuels spekki01: you can see what it's writing, if anything
17:35 spekki01 well i think it was hanging and not doing anything
17:35 spekki01 and yeah it was 100% usage
17:35 spekki01 heres what i think is wrong
17:36 spekki01 the file i got from NCBI was the protein database nr.gz, after dl and unpacking. i look into it and it has weird end of line characters all over, "^A", and i think hmmer doesn't recognize these and is trying to read the whole 6gig file as one line...
17:37 spekki01 so im attempting to write a quick shell command using awk or sed to change all of the ^A to \n
17:39 deafferret perl -p -i -e 's/\cA/\n/g' filename   ?
17:41 spekki01 i was gonna do sed '/s/^A/\n/g' test.txt but its not working
17:41 spekki01 lemme try what you have
17:41 deafferret ^A   is  "line begins with an A"
17:42 spekki01 oh
17:42 deafferret unless you're actually hitting Control-L Control-A there
17:42 spekki01 ok yours works way better
17:43 deafferret hmm, not control-L... apparently I've forgotten the key sequence in bash... ignore that part  :)
18:06 spekki01 deafferret: how would i change that script so instead of \n i wanted a space instead would it be something like perl -p -i -e 's/\cA/ /g' filename?
18:08 deafferret yup
18:08 deafferret perldoc perlrun
18:08 spekki01 kk tx
18:08 deafferret you probably want to start doing -i.bak or something
18:08 deafferret so it's not overwriting your file, it creates a new file
18:08 deafferret since you're not sure you're doing it right  :)
18:09 spekki01 yeah i have copies lol i dont need to spend another 2 hours dling the file
18:09 deafferret :)
18:10 deafferret i usually proof-of-concept against head -1000 files
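
A rough sketch of the approach discussed above: try the conversion on a small sample first, and write the result to a new file so the original stays untouched. It uses tr rather than the perl one-liner from the log, purely to keep the example free of extra dependencies; the file names and the two-line sample record are made up for illustration. In bash a literal Ctrl-A can be typed as Ctrl-V Ctrl-A, but the octal escape \001 avoids needing to.

```shell
# Sketch only: file names and sample content are hypothetical.
# Build a tiny stand-in for the real database, containing one Ctrl-A (octal 001) byte.
workdir=$(mktemp -d)
printf '>id1 first title\001id2 second title\nMKV\n' > "$workdir/sample.fa"

# Work on the first 1000 lines rather than the full multi-gigabyte file.
head -n 1000 "$workdir/sample.fa" > "$workdir/head.fa"

# tr reads stdin and writes stdout, so the input file is never modified.
tr '\001' ' ' < "$workdir/head.fa" > "$workdir/head.fixed.fa"

# Count the Ctrl-A bytes that survived the conversion; 0 means all were replaced.
remaining=$(tr -cd '\001' < "$workdir/head.fixed.fa" | wc -c | tr -d ' ')
echo "Ctrl-A bytes left: $remaining"
```

Once the sample looks right, the same tr command can be pointed at the full file, again writing to a new output file.
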
18:31 spekki01 k well im back to running my hmmer searches on my newly fixed protein db, but i still cant tell if it's searching or just hanging. I did tail -f outputfile and it shows nothing, is there any way to have hmmsearch do some kind of verbose output so i know it's at least not just stuck on something?
18:31 spekki01 i went through the manual and i just don't see any options that output something to the terminal
18:41 deafferret strace should dump an infinite amount of garbage to your screen as it runs...
18:43 deafferret as far as anything useful goes, if the man page doesn't say anything you're probably stuck
18:44 deafferret if your same command works on smaller datasets, and your big set isn't running the system out of memory, then it'll probably finish eventually
18:44 deafferret but if the authors didn't hook it for "0.02% complete" output then I don't know how you'll know
18:45 deafferret hmms are a fad. just stall and no one will want the results  :)
18:45 deafferret like dna analysis  :)
18:46 spekki01 lol
18:46 deafferret next week someone will discover a molecule more interesting than dna, then we'll all jump on that new bandwagon
18:46 spekki01 well the command worked on a smaller database so im assuming its running
18:47 deafferret your database is just 1 massive file? or lots of files?
18:47 spekki01 the only difference between the databases was the larger one had all these ^A 's in it but i replaced those with spaces so it's in the same format as the smaller one
18:47 spekki01 one massive 6gig file
18:48 deafferret re: ^A: scary
18:49 spekki01 what does that ^A mean anyway is it a space character on a different system or something?
18:49 deafferret no clue
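
For what it's worth, ^A is how terminals and editors display the byte 0x01 (Ctrl-A, ASCII SOH). As far as I know, NCBI's nr FASTA files use it to join the titles of identical sequences that were merged into a single record, so it is a separator inside the header line rather than an end-of-line marker. If that holds, turning it into a newline would push the extra titles into the sequence lines, while a space keeps the header on one line. A small sketch, with an invented header, of how to inspect and split such a line:

```shell
# Hypothetical nr-style header: two titles joined by a Ctrl-A byte (identifiers are made up).
header=$(printf '>gi|1|ref|A| protein one\001gi|2|ref|B| protein two')

# Show the raw bytes; od -c prints Ctrl-A as octal 001.
printf '%s\n' "$header" | od -c | head -n 3

# Split on Ctrl-A to list each merged title on its own line.
titles=$(printf '%s\n' "$header" | tr '\001' '\n')
printf '%s\n' "$titles"

# Number of titles that were packed into the single header line.
title_count=$(printf '%s\n' "$titles" | wc -l | tr -d ' ')
echo "titles in header: $title_count"
```
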
18:53 spekki01 i feel bad for running this search its eating up all the cpu
18:53 dnewkirk you can use the --cpu parameter to specify the number of cores to use
18:54 deafferret spekki01: leave me 1 cpu please
18:54 spekki01 lol
18:55 spekki01 i might have to set that, its been running 70 min on the servers and using all the cpu
18:55 dnewkirk leave him the pentium II
19:02 * dnewkirk realizes he has an odd sense of humor. decides to stop foisting it on the channel
19:07 spekki01 omfg did you see the google logo today lol
19:07 spekki01 playable pacman :)
19:15 deafferret lol  cool
19:15 deafferret beat level 1, hit a bug -- went right through a ghost somehow
19:15 deafferret :)
19:16 * deafferret goes back to work
19:24 spekki01 sigh running 2 hours 7 min and still nothing from my hmmer search, can it really take this long when going through a really big db file?
19:26 deafferret yup.
19:27 deafferret computers should generally be avoided
19:27 spekki01 maybe i should just go home and let it run over the weekend
19:27 * deafferret passes the bottle to spekki01
19:27 spekki01 lol
19:27 deafferret shh... I won't tell rbuels
19:39 splut joined #bioperl
20:30 kblin evening folks
20:31 spekki01 hi
20:31 * deafferret waves
20:34 kblin I'm trying to parse HMMer3 hmmscan results using Bio::SearchIO. the parser reads in the data without throwing an error, but it fails to find any classification hits
20:34 kblin is there any trick to it, or is the new format just not supported?
20:35 deafferret kblin: is the data you're trying to find in BioPerl definitely in your results file?
20:37 kblin deafferret: yeah, I get results all right
20:37 kblin so loading the file works
20:38 deafferret no, I mean are your "classification hits" definitely part of the results files? just bioperl isn't parsing them (apparently)
20:39 deafferret can you nopaste your code somewhere so we can see what you're trying to do?
20:39 kblin sure
20:41 kblin http://codepad.org/5rfWJ7AE
20:42 deafferret have a small result file snippet so I can see what your "classification hits" are?
20:42 kblin and the start of the hmmer results file looks like http://codepad.org/9FKP7I06
20:42 deafferret :)
20:43 deafferret I don't see 'class*' in that result file
20:43 deafferret c-Evalue ?
20:45 * deafferret sighs mightily in the general direction of $vendor in San Diego
20:49 kblin hm
20:50 kblin there should be a class* in the result file?
20:53 kblin ah, I see it in the hmmerpfam results of an old run :)
20:56 deafferret so the bioperl parser can't find it 'cause it's not there?  :)
20:58 kblin ok, so basically the bioperl parser doesn't support hmmer3 yet.. :)
21:01 deafferret i don't know squat about hmmer formats, so I don't know. I'm happy to help you debug it if you give me the info I need to reproduce your results
21:02 kblin I can dump a full result file somewhere, if that helps
21:03 deafferret well, I don't yet understand what you're expecting to parse that isn't parsing
21:04 deafferret all you've said is 'classification hits', but I have no idea what you mean by that
21:04 deafferret where, in the result file(s), are these items you want to parse?
21:04 kblin ah
21:05 kblin basically I'm expecting the contents of the "Scores for complete sequence (score includes all domains):
21:05 kblin "
21:05 kblin table to show up in the results object
21:05 kblin that's what happens when I parse the hmmer2 output
21:06 CIA-95 bioperl-live: Chris Fields master * rcbed980 / Bio/Tools/EUtilities/Link.pm : small doc fix, more to come - http://bit.ly/cy4DNQ
21:06 deafferret so you're saying bioperl appears to be ignoring lines 11-21?
21:07 kblin yeah
21:07 kblin the old file format looks like http://codepad.org/yMBVAP86
21:08 deafferret huh. ok, lemme give this a stab
21:08 * deafferret clocks out
21:10 deafferret what's the standard filename extension for these?
21:10 deafferret .hmmsearch   ? or...?
21:11 kblin our framework names them .hsp, but I don't know if that's the accepted standard
21:13 kblin ok, so the way I read Bio/SearchIO/hmmer.pm line 301, this looks like a check for the new output format
21:13 pyrimidine joined #bioperl
21:15 * rbuels waves at pyrimidine
21:15 * pyrimidine waves back
21:16 CIA-95 bioperl-live: Lincoln Stein lstein-seqfeature-store-summaries * r974abf3 / (8 files in 4 dirs): gene coverage stats seem to be working,,,basically - http://bit.ly/dezhO3
21:17 pyrimidine lincoln knows github
21:17 rbuels and football, and baseball .....
21:18 * rbuels dates himself
21:18 deafferret wow. deafferret thought about, then passed on that joke
21:18 rbuels deafferret: *you*?   *passed*? on a joke?
21:19 rbuels i'm shocked.  shocked!
21:19 deafferret purl: rbuels is also below the bottom of deafferret's joke barrel
21:19 deafferret barell?
21:19 deafferret barrell?
21:19 deafferret bear-ul
21:20 rbuels i always spelled it barayl
21:20 deafferret all wood gnomes do
21:20 rbuels dunno, i could be wrong
21:21 deafferret ogres tend toward bear-ul
21:21 kblin deafferret: so, the output of the Dumper call for the file I pasted dumps http://codepad.org/SDSG6yCU
21:21 * rbuels has no idea what deafferret is talking about
21:22 deafferret rbuels: oh, sorry. forgot you were in the wood gnome closet
21:23 deafferret "The server at github.com is taking too long to respond."
21:23 deafferret maybe we should host our own SVN server or something?
21:23 rbuels lol
21:23 deafferret kblin: I'm seeing the same thing. someday you will be able to see my code here   http://github.com/jhannah/sandbox/tree/master/kblin
21:24 deafferret kblin: so next I was going to try to find that top block somehow
21:24 deafferret probably perl -d inside hmmer.pm or something
21:25 deafferret github.com/bioperl had a good run, but maybe it's time we gave the strategy a 2nd think
21:25 * deafferret runs and hides
21:25 kblin deafferret: I'm expecting http://codepad.org/QMX0xIF8 (cut a little)
21:25 deafferret github being down is funny for the first 60s or so. then... not so much
21:26 * deafferret wells up a little
21:26 deafferret kblin: k. hmm
21:29 deafferret looks like hmmer.pm hasn't been touched since 2008.  hmmer_pull since 2006
21:30 deafferret how old is this hmmer3 format?
21:30 kblin hm, hmmer3 has been released in 2009, but has been in development for a couple of years
21:32 kblin but I agree the fact that there's no "implement hmmer3 support" is a bit suspicious
21:32 kblin nasty that the new format seems close enough to the old hmmsearch format for the parser to not throw an error
21:33 deafferret if (/^Scores for complete sequences/o) {
21:33 deafferret has an extra 's' on the end   :/
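
The mismatch is easy to reproduce outside Perl. This sketch uses grep instead of the parser itself, takes the HMMER3 header text from kblin's description earlier in the log, and takes the pattern from the line quoted above. The relaxed pattern is only one possible fix, not necessarily what hmmer.pm should adopt.

```shell
# Header line as kblin quoted it from the HMMER3 output.
line='Scores for complete sequence (score includes all domains):'

# Pattern with the trailing "s", as quoted from hmmer.pm: no match.
# "|| true" keeps grep's non-zero exit status (no match) from aborting a strict shell.
strict=$(printf '%s\n' "$line" | grep -c '^Scores for complete sequences' || true)

# Making the final "s" optional accepts both the singular and plural spelling.
relaxed=$(printf '%s\n' "$line" | grep -Ec '^Scores for complete sequences?' || true)

echo "strict=$strict relaxed=$relaxed"
```
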
21:34 kblin ah, durn
21:35 kblin ok, so both hmmer3 and hmmer2 have a hmmsearch binary
21:35 kblin the manual claims the hmmscan output (that's what I'm trying to parse) is the same as the hmmsearch output
21:35 kblin but probably that only holds in hmmer3
21:36 kblin and it seems like that subtly changed from the hmmer2 hmmsearch output
21:36 kblin so I misread the docs
21:36 deafferret $self->{'_reporttype'}  never gets set, so we're never going to hit the block of your interest...?
21:37 kblin hm, where's that actually coming from? :)
21:37 kblin I mean I know that parser works for the old format :)
21:38 deafferret end_element() sets it
21:40 kblin ah
21:40 kblin I take it uc $1 uppercases $1? :)
21:41 deafferret ya
21:42 kblin ok. looks like I'll be spending some fun time writing a parser for hmmscan data myself then
21:42 kblin or rather, see if I can fiddle this into hmmer.pm
21:42 deafferret kblin: well, please fix bioperl  :)
21:42 deafferret add lots of tests
21:43 deafferret write a new hmmer.pm and we'll throw the old one out, assuming yours does everything
21:43 rdesfo joined #bioperl
21:43 pyrimidine there was someone working on a hmmer3 set of parsers on the list
21:43 deafferret or write hmmer3.pm, which would be better than bioperl having nothing
21:43 pyrimidine might be worth asking about it there
21:44 pyrimidine I'll see if I can dig up who it was...
21:44 * deafferret races pyrimidine
21:45 kblin pyrimidine: yeah, I subscribed to the list earlier today, but my university likes to use graylisting, so the confirmation email always takes a while
21:45 pyrimidine Thomas Sharpton
21:45 kblin http://bioperl.org/pipermail/bioperl-l/2009-February/029259.html <-that one?
21:46 pyrimidine no
21:46 rdesfo hello, I'm going into bioengineering and I was wondering if there were any projects I could participate in regarding bioperl?
21:46 deafferret http://article.gmane.org/gmane.comp.lang.perl.bio.general/21880/match=hmmer3   ?
21:46 kblin http://bioperl.org/pipermail/bioperl-l/2010-February/032338.html
21:46 kblin got it
21:47 rbuels pyrimidine: i am emailing the search.cpan.org maints about that POD rendering thing
21:47 rbuels pyrimidine: i will CC bioperl-l
21:47 pyrimidine kblin: http://lists.open-bio.org/pipermail/bioperl-guts-l/2010-May/031172.html
21:47 pyrimidine though, looking at that repo, there is nothing
21:48 deafferret shouldn't bioperl-dev stuff just be a branch in bioperl-live instead?
21:48 pyrimidine it should
21:49 kblin pyrimidine: that certainly _looks_ nice :)
21:50 kblin I'll email the author and ask, thanks
21:50 rbuels deafferret: yes.
21:50 spekki01 when using the nice command anyone know the default nice value that say a shell script would run at? i know its off topic.
21:50 deafferret rdesfo: there's gobs of bioperl bugs. wade in!  :)
21:51 deafferret rdesfo: github.com/bioperl
21:51 deafferret rdesfo: bioperl.org
21:51 rdesfo thanks I've been to bioperl.org
21:51 deafferret rdesfo: and/or do some of my work for me. I have ~80 tickets you could work
21:51 rdesfo :)
21:51 rbuels spekki01: i think it's 0
21:51 rbuels spekki01: on most systems ...
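
A quick way to check this on a given machine, as a sketch: it assumes GNU coreutils, where nice with no command prints the current niceness. A process inherits its niceness from its parent (0 in an ordinary login shell), and nice without -n adds 10 to that, capped at 19. The adjustment of 5 below is an arbitrary value for illustration.

```shell
# Print the niceness this shell is running at (0 on most systems).
base=$(nice)
echo "current niceness: $base"

# With no -n option, nice adds 10 to the inherited niceness (capped at 19).
bumped=$(nice nice)
echo "under plain nice: $bumped"

# An explicit adjustment instead of the default 10.
custom=$(nice -n 5 nice)
echo "under nice -n 5: $custom"

# The same prefix works for a long search, e.g. combined with the --cpu option
# mentioned earlier in the log (file names here are hypothetical):
#   nice -n 10 hmmsearch --cpu 2 model.hmm nr.fa > results.out
```
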
21:51 deafferret i'll go 80/20 on my stipend with ya
21:51 rbuels spekki01: #perl would be a better place to ask
21:52 deafferret :)
21:52 rbuels spekki01: same server as this channel
21:52 deafferret rdesfo: biodoc.ist.unomaha.edu/rt guest guest
21:52 spekki01 ok
21:53 deafferret kblin: let us know what Tom says. I don't see his code in bioperl-dev
21:54 deafferret kblin: i usually say what pyrimidine says. it just takes me 15X longer  :)
21:54 rdesfo deafferret: thanks
21:56 deafferret pyrimidine: where is this code?  http://lists.open-bio.org/pipermail/bioperl-guts-l/2010-May/031172.html
21:57 kblin thanks for the help folks
21:58 kblin SIGWIFE, though, be back tomorrow :)
21:58 deafferret heh. welcome to the scramble! hope you can help  :)
21:58 deafferret ERUNLIKEHELL
21:59 pyrimidine deafferret: it isn't there
21:59 deafferret r16984
21:59 deafferret created a dir, which was later deleted?
21:59 pyrimidine svn ls svn+ssh://dev.open-bio.org/home/svn-repositories/bioperl/bioperl-hmmer3
21:59 pyrimidine gets nothing
22:00 pyrimidine the svn repo is still there, just read-only
22:00 deafferret huh. ok. thanks.    /me clocks back in
22:02 rbuels pyrimidine: http://search.cpan.org/~cjfields/BioPerl-1.6.1/examples/root/lib/Bio/PrimarySeqI.pm#translate  versus  http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/PrimarySeqI.pm#translate
22:02 rbuels pyrimidine: i think this is actually what is causing brian's problem
22:03 rbuels pyrimidine: the former (which does not include lots of the POD on translate()) shows up first in the search.cpan.org results
22:03 rbuels pyrimidine: which is probably why Brian clicked on it
22:03 rbuels pyrimidine: is there a dev release or something that should be deleted?
22:04 * pyrimidine looking
22:05 pyrimidine I wonder if that was something that occurred during the POD cleaning.
22:05 pyrimidine had a POD checker run through the code at one point, which reported back bad spots
22:05 pyrimidine I fixed those
22:05 pyrimidine but maybe broke something?!?
22:05 pyrimidine rbuels: dev release?
22:06 rbuels pyrimidine: if you do a CPAN search for 'Bio::PrimarySeqI' you get a top hit that has [Developers] next to it
22:06 rbuels i'm not sure what that is
22:06 rbuels just guessing that it might have something to do with a dev release
22:06 rbuels or maybe it doesn't
22:08 pyrimidine I could delete those
22:09 pyrimidine odd, looking at those, not seeing the dev versions for bioperl core
22:10 pyrimidine Oh, I see what you mean
22:11 pyrimidine it's showing upp in the name
22:11 pyrimidine *up
22:11 pyrimidine Bio::PrimarySeqI [Developers] - Interface definition for a Bio::PrimarySeq
22:11 rbuels pyrimidine: yes, in the search hit
22:12 pyrimidine SeqI as well, but not all interfaces
22:12 pyrimidine Bio::Align::AlignI doesn't
22:12 rbuels pyrimidine: any idea where the erroneous one came from?
22:13 * rbuels writes a followup on the search.cpan.org rt ticket
22:13 pyrimidine Not sure; SeqI has it right in the POD
22:13 rbuels well, in the source that is being linked from those pages
22:14 rbuels seems to me, there are probably some wires crossed in the html generation or the versions of the packages that are uploaded, or both
22:14 pyrimidine PrimarySeqI doesn't though (for 1.6.1)
22:14 pyrimidine or 1.6.0
22:15 pyrimidine almost seems like a caching issue
22:15 rbuels pyrimidine: examples/root/lib ?
22:15 rbuels pyrimidine: does that ring a bell?
22:15 rbuels notice one has that in the page
22:15 rbuels er, in the path
22:16 pyrimidine yes, that's supposed to be an example file
22:16 rbuels um.
22:16 rbuels http://cpansearch.perl.org/src/CJFIELDS/BioPerl-1.6.1/examples/root/lib/Bio/PrimarySeqI.pm
22:16 rbuels there is an old version of PrimarySeqI included in the dist.
22:16 pyrimidine ok, that probably explains it, then
22:16 rbuels and search.cpan.org is indexing it
22:16 pyrimidine right
22:16 rbuels and it's the top hit!
22:17 deafferret !
22:17 pyrimidine yes, that's one oddity of the indexing there
22:17 rbuels depth-first maybe
22:17 pyrimidine the other is it will find the first module with BioPerl in the name and use that as the description of the repo
22:17 rbuels o_O
22:18 pyrimidine hence the reason we have BioPerl.pm
22:18 pyrimidine in the root dir
22:18 rbuels ah.
22:18 pyrimidine So, if you look up BioPerl, it will find that
22:19 pyrimidine 'Perl Modules for Biology'
22:19 pyrimidine instead of
22:19 pyrimidine 'Loader for LiveSeq from EMBL entries with BioPerl'
22:19 pyrimidine which comes from Bio::LiveSeq::IO::BioPerl
22:19 pyrimidine okay, gotta run
22:19 pyrimidine o/
22:19 deafferret o7
22:20 deafferret sudo rbuels /nick rbuels____ # so conversations line up correctly
22:22 rbuels pyrimidine: do those modules in examples need to be there?
22:22 rbuels pyrimidine: can we just delete them and re-upload a new dist?
22:22 rbuels pyrimidine: ok, i'm going to close the RT ticket then
22:24 rbuels sudo deafferret get a better irc client
22:25 deafferret tsk tsk. not a team player, that one.