IRC log for #bioperl, 2010-05-17

All times shown according to UTC.

Time Nick Message
01:47 cawiss joined #bioperl
06:42 vinnana joined #bioperl
08:15 kps joined #bioperl
09:47 Lynx__ joined #bioperl
10:19 JunY joined #bioperl
10:23 JunY Hi, deafferret. Have you solved the problem of ucsc genome browser?
12:00 abooz joined #bioperl
13:00 * JunY yawns
13:30 * faceface screams
14:22 spekki01 joined #bioperl
16:38 perl_splut joined #bioperl
16:52 pyrimidine joined #bioperl
16:53 cawiss joined #bioperl
17:28 CIA-42 Bio-FeatureIO: Chris Fields master * ra951334 / .shipit : add default .shipit file - http://bit.ly/90Qjbe
17:35 pyrimidine we're being watched by the CIA...
17:35 pyrimidine :)
17:35 perl_splut anyone played with Bio::Biblio?
17:36 pyrimidine no
17:36 pyrimidine haven't really looked at it
17:37 pyrimidine it's one of the several sections of bioperl we're thinking of releasing as a separate component
17:37 perl_splut Trying to come up with a way to automate the discovery of new articles by certain authors to populate a site
17:38 pyrimidine there was talk a while back of having it replace some of the Bio::Annotation::Reference stuff, but IMO I think it's too heavy
17:47 CIA-42 Bio-FeatureIO: Chris Fields master * ra5c1a15 / (t/data/knownGene.gff3 t/data/knownGene2.gff3): add test files from branch back to master - http://bit.ly/a9S0vV
17:47 CIA-42 Bio-FeatureIO: Chris Fields master * r3d09f73 / (6 files in 2 dirs): pull split tests over from refactor branch - http://bit.ly/9WintE
18:06 dnewkirk joined #bioperl
18:09 batta joined #bioperl
18:13 batta left #bioperl
18:19 dnewkirk_lab joined #bioperl
18:20 pyrimidine Um, just wow.  Uncovered a MAJOR bug in FeatureIO.  Not sure how anyone found this module of any use whatsoever.
18:22 CIA-42 bioperl-live: Chris Fields master * rb991c24 / t/SeqFeature/FeatureIO.t : bug in FeatureIO, TODO demonstrates the problem - http://bit.ly/dbBuFD
18:22 perl_splut what bug is that?
18:23 pyrimidine when parsing gff3, if it encounters a FASTA-only file
18:23 pyrimidine it takes the first line of sequence as the ID, rest as sequence
18:24 pyrimidine http://gist.github.com/404061 (trimmed down from t/SeqFeature/FeatureIO.t)
18:34 perl_splut isn't that what FASTA is?
18:36 pyrimidine No.  It is skipping the descriptor line altogether
18:36 dnewkirk_lab oops
18:36 pyrimidine #   Failed (TODO) test at t/SeqFeature/FeatureIO.t line 43.
18:36 pyrimidine #          got: 'TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC'
18:36 pyrimidine #     expected: 'Test1'
18:36 pyrimidine EPICFAIL
18:37 rbuels pyrimidine: oh i see, if you try to read just a bare fasta file with it, it doesn't work
18:37 pyrimidine yep
18:37 rbuels pyrimidine: probably that case was never tested, since the common case is having fasta appended after some features
18:37 * rbuels is not surprised
18:37 pyrimidine yes, but it needs to support that
18:38 pyrimidine the parser should be blind to the difference
18:39 rbuels yes of course, i agree that it's a bug, lol
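A minimal sketch of the behaviour being asked for here, assuming a plain FASTA reader written from scratch: the ID comes from the `>` descriptor line, whether or not any GFF3 features precede the sequence. The helper name `read_fasta` and the sample record are made up for illustration; this is not the Bio::FeatureIO code or its fix.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::FeatureIO): read FASTA records from a
# filehandle, taking the ID from the '>' descriptor line, never from sequence.
sub read_fasta {
    my ($fh) = @_;
    my (@records, $current);
    while (my $line = <$fh>) {
        chomp $line;
        # Skip blank lines and the GFF3 ##FASTA directive, if present
        next if $line =~ /^\s*$/ || $line =~ /^##FASTA/;
        if ($line =~ /^>(\S+)\s*(.*)$/) {
            # Descriptor line: first token is the ID, the rest is the description
            $current = { id => $1, desc => $2, seq => '' };
            push @records, $current;
        }
        elsif ($current) {
            # Sequence lines are concatenated onto the current record;
            # lines seen before any descriptor are ignored in this sketch
            $current->{seq} .= $line;
        }
    }
    return \@records;
}

# In-memory filehandle standing in for a bare FASTA file (made-up data)
my $fasta = ">Test1 example record\nTTCTGAACTGG\nTTACCTGCCGT\n";
open my $fasta_fh, '<', \$fasta or die "cannot open in-memory file: $!";
my $records = read_fasta($fasta_fh);
close $fasta_fh;

printf "%s\t%s\n", $_->{id}, $_->{seq} for @$records;
```

Because the `##FASTA` directive is simply skipped, the same loop handles both a bare FASTA file and a FASTA section appended after features, which is the "blind to the difference" property mentioned above.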
18:40 * pyrimidine checking embedded fasta
18:41 rbuels pyrimidine: i was also thinking the other day that going forward, bioperl parsers should try to separate the two tasks of a.) parsing and b.) making bioperl data model (i.e. SeqFeatureI, SeqI) objects out of the parsed data
18:41 rbuels because it seems to me that there are many tasks for which one needs a parser, but not the data model
18:41 rbuels and there are also many tasks for which you want the data model too
18:41 pyrimidine agreed.  Actually have started down that path already
18:42 pyrimidine have a next_dataset that returns hash-refs of data, passes it into a handler for doing ... whatever
18:43 * pyrimidine points to Bio::SeqIO::gbdriver and ilk
18:43 pyrimidine though the next_dataset in that case needs to be abstracted out
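The split described here (a parser that hands back hash-refs, and a separate handler that decides what to build from them) could look roughly like the sketch below. Only the name `next_dataset` comes from the discussion; the line format, `make_parser`, and the handler table are invented for illustration and do not reflect Bio::SeqIO::gbdriver internals.

```perl
use strict;
use warnings;

# Made-up flat record format, one "type: value" per line. Real formats
# (GenBank, GFF3) would need real tokenising; this only shows the shape.
my $text = <<'END';
annotation: organism=Escherichia coli
feature: gene 1..100
feature: CDS 10..90
sequence: TTCTGAACTGG
END

# Parser: knows nothing about the data model; returns plain hash-refs.
sub make_parser {
    my ($fh) = @_;
    return sub {                                  # plays the role of next_dataset
        while (defined(my $line = <$fh>)) {
            chomp $line;
            next unless $line =~ /^(\w+):\s*(.*)$/;
            return { type => $1, data => $2 };    # one chunk of related data
        }
        return;                                   # stream exhausted
    };
}

# Handler: decides what to do with each chunk (here, just collect by type).
my %seen;
my %handler = (
    annotation => sub { push @{ $seen{annotation} }, $_[0]{data} },
    feature    => sub { push @{ $seen{feature} },    $_[0]{data} },
    sequence   => sub { $seen{sequence} .= $_[0]{data} },
);

open my $chunk_fh, '<', \$text or die "cannot open in-memory file: $!";
my $next_dataset = make_parser($chunk_fh);
while (my $chunk = $next_dataset->()) {
    my $cb = $handler{ $chunk->{type} } or next;  # ignore unknown chunk types
    $cb->($chunk);
}
close $chunk_fh;

printf "%d feature(s), sequence length %d\n",
    scalar @{ $seen{feature} }, length $seen{sequence};
```

Swapping the handler table for one that builds data model objects, or for one that only counts features, leaves the parser untouched.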
18:44 pyrimidine here's the problem....
18:44 pyrimidine if we do this (parse out chunks, or go event-based)
18:44 pyrimidine we need to come up with a standardized way of doing it
18:44 * rbuels votes against event-based
18:45 * pyrimidine same here
18:45 pyrimidine chunks of related data is the best way
18:45 pyrimidine for seq records, annotation/features/sequence
18:45 rbuels the problem is 'how high-level are the chunks'?
18:46 pyrimidine precisely
18:46 rbuels e.g. whole sequence?  genes + subfeatures?
18:46 pyrimidine not whole sequence
18:46 rbuels yeah, that's sort of the canonical question with pull parsers
18:46 pyrimidine a stream of simple related data
18:46 rbuels guy in pdx.pm was talking about this w.r.t xml parsing
18:47 pyrimidine that's the model I'm thinking of
18:47 rbuels he sort of said that it comes down to being able to configure the pull parser
18:47 rbuels to get what you want
18:47 pyrimidine yep
18:47 rbuels there probably is not a way to standardize that across all formats.
18:48 pyrimidine I think we can do pretty well with the ones we already have
18:49 rbuels just refactoring a little to separate the datamodel construction from the parsing
18:49 rbuels yeah, probably.
18:49 pyrimidine It could even be done lazily, pull-parser style
18:50 rbuels well, lazy data model construction is .... kind of thorny
18:51 rbuels the data model objects at that point need to be able to sort of have intermediate states right?
18:51 rbuels not-really-constructed ...
18:51 rbuels a-little-more-constructed....
18:51 rbuels fully-constructed ....
18:51 rbuels or something?
18:51 pyrimidine i was thinking of something like, when parsing through a stream of data, if you know specific points of 'interest' (start of features, annotation, etc)...
18:51 pyrimidine spawn a child IO, with the fh, that retains the location in the stream
18:52 pyrimidine the parent parser proceeds on, while the 'child' is at that point of interest
18:53 pyrimidine when called, using something like get_Annotations
18:53 rbuels oh i see
18:53 pyrimidine use a generic parser to parse out the chunks
18:53 rbuels what you're actually doing is indexing.
18:53 pyrimidine sort of
18:53 spekki01 joined #bioperl
18:53 pyrimidine indexing at a very high level
18:54 pyrimidine not every annotation/feature/sequence
18:54 rbuels the thing that's being saved is the offset in the file of the start of the sub-things
18:54 pyrimidine but the start/end of those
18:54 pyrimidine yes
18:54 rbuels really.
18:54 pyrimidine so very little is retained in memory
18:54 pyrimidine where it gets tricky
18:55 pyrimidine things like species info
18:55 rbuels so parsers need to be really good at skipping things as well as parsing things.  it does increase the complexity of the parser.
18:55 pyrimidine yes, but it's possible
18:55 pyrimidine already have a prototype parser
18:55 pyrimidine http://github.com/cjfields/Bio-Stream
18:55 pyrimidine but it needs a LOT of work
18:56 rbuels how is species info particularly tricky
18:56 pyrimidine well, we sort of fake a buffer in Bio::Root::IO, right?
18:56 pyrimidine could probably work in discontinuous reads
18:57 pyrimidine just need to think how to set that up
18:57 pyrimidine The other thing: this could (in essence) replace all the Bio::Index and related modules
18:58 pyrimidine (we would know record positions in a file)
18:59 pyrimidine afk * # meeting
18:59 rbuels ok
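The high-level indexing idea above (keep only the start and end offsets of each section, read the content later on demand) can be sketched with Perl's built-in `tell`, `seek`, and `read`. The bracketed section markers, the `%span` table, and `read_section` are assumptions made for this example; they are not taken from Bio-Stream or Bio::Index.

```perl
use strict;
use warnings;

# Toy record with invented section markers in square brackets.
my $record = <<'END';
[ANNOTATION]
organism=Escherichia coli
[FEATURES]
gene 1..100
CDS 10..90
[SEQUENCE]
TTCTGAACTGG
END

open my $index_fh, '<', \$record or die "cannot open in-memory file: $!";

# First pass: remember only where each section's content starts and ends.
my (%span, $open);
while (1) {
    my $pos  = tell $index_fh;            # offset of the line about to be read
    my $line = <$index_fh>;
    last unless defined $line;
    next unless $line =~ /^\[(\w+)\]$/;   # section marker line
    $span{$open}{end} = $pos if defined $open;
    $open = $1;
    $span{$open}{start} = tell $index_fh; # content starts after the marker
}
$span{$open}{end} = tell $index_fh if defined $open;

# Later, on demand: jump to a stored offset and read just that section.
sub read_section {
    my ($fh, $span) = @_;
    seek($fh, $span->{start}, 0) or die "seek failed: $!";   # 0 = SEEK_SET
    my $len = $span->{end} - $span->{start};
    read($fh, my $buf, $len);
    return $buf;
}

print read_section($index_fh, $span{FEATURES});
```

Only the offsets stay in memory between the two passes; the section text is read again from the file each time it is requested.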
19:08 pyrimidine ok, back
19:08 pyrimidine anyone know of any IO::* modules that would simulate the above?
19:10 rbuels which above
19:10 rbuels pyrimidine: ^^
19:10 pyrimidine what we were discussing
19:11 rbuels i don't understand, simulate what part?
19:11 pyrimidine something that acts as if you are calling readline, but can do it discontinuously if given (for instance) file points
19:11 rbuels (and for what purpose)?
19:11 rbuels oh.
19:11 rbuels that's a 2-line subroutine ...
19:12 rbuels sub read_from { my ($fh, $where) = @_;  $fh->seek($where, 0); $fh->readline }
19:12 rbuels ?
19:13 pyrimidine I'm thinking of something more like, if given start1/end1 and start2/end2 blocks
19:13 pyrimidine or more
19:13 pyrimidine read as if it's one file
19:13 pyrimidine skipping over the intervening bits
19:13 pyrimidine (ie the stuff between end1 and start2)
19:14 rbuels oh i see
19:14 pyrimidine that's a little trickier
19:14 pyrimidine but doable
19:14 pyrimidine (I think)
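One way the discontinuous read could be sketched is a closure that walks a list of byte ranges and yields lines as if they were one file. The name `block_reader` is hypothetical, the ranges are assumed to be half-open `[start, end)` and to end on line boundaries, and the offsets are hard-coded for a toy string rather than produced by a real index.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an iterator that yields lines from the given
# [start, end) byte ranges of $fh in order, as if they formed one file.
sub block_reader {
    my ($fh, @blocks) = @_;
    my $current;                                   # block being read, if any
    return sub {
        while (1) {
            if (!$current) {
                $current = shift @blocks or return;    # no blocks left
                seek($fh, $current->[0], 0) or die "seek failed: $!";
            }
            if (tell($fh) < $current->[1]) {
                my $line = <$fh>;
                return $line if defined $line;
            }
            undef $current;                        # block finished, move on
        }
    };
}

my $data = "keep 1\nskip a\nskip b\nkeep 2\nkeep 3\n";
open my $block_fh, '<', \$data or die "cannot open in-memory file: $!";

# Offsets are hard-coded for this toy string; in practice they would come
# from an earlier indexing pass.
my $next_line = block_reader($block_fh, [0, 7], [21, 35]);
while (defined(my $line = $next_line->())) {
    print $line;
}
```

The caller sees an ordinary line-by-line loop; the skipped bytes between the end of the first block and the start of the second are never read.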
19:15 rbuels i don't think you'd need to do that for these bio modules ....
19:15 rbuels the file access schemes would probably be different from that ...
19:16 rbuels i dunno
19:16 rbuels i need to go soon ....
19:16 pyrimidine in genbank, for something like species, where the data is in annotation and in features, might be necessary to have this around
19:16 pyrimidine *source features
19:16 pyrimidine but there is a way around that, I think, so no worries
19:18 spekki01 joined #bioperl
19:18 rbuels pyrimidine: regarding, branches, i was going to make those moves today
19:18 rbuels pyrimidine: where is a good place in the wiki to put the recovery scheem
19:18 abooz hi guys, this small bug i sorted in Bio::FeatureIO::gff, how can i push on github or wherever to have it tested? Can i just send the patch to mailing list?
19:18 rbuels s/em/me/
19:18 pyrimidine rbuels: go for it, when you have time
19:18 rbuels abooz: you can send a pull request on github
19:19 rbuels abooz: or you can send a patch to the mailing list
19:19 rbuels abooz: or if the bug is in bugzilla (it is, isn't it), you can put the patch in there
19:19 pyrimidine funny, a bug in Bio::FeatureIO::gff.  imagine that.
19:19 * rbuels has never heard of one
19:19 pyrimidine :-D
19:20 abooz cool.
19:20 pyrimidine abooz: send in a pull request via github.  I can look it over.
19:20 abooz my git skills are non existent. I'll try the bugzilla route :)
19:21 spekki01 joined #bioperl
19:21 pyrimidine abooz: if you do that, only select 'bioperl' as the person to send a request to
19:21 rbuels pyrimidine: i'm going to attach the archived-branch-recovery instructions on http://www.bioperl.org/wiki/Using_Git, sound ok>
19:21 rbuels ?
19:21 pyrimidine rbuels: ok
19:22 pyrimidine re: pull requests, if you are a collab and have a default email, unless you have personal setting turned off to prevent pull requests, you will get two messages (one from bioperl-guts-l, one directly)
19:23 pyrimidine github is supposed to fix this at some point
19:23 * rbuels has noticed that, lol
19:23 pyrimidine the default when posting a pull request is that all collabs get one, which is kinda silly for groups
19:48 pyrimidine rbuels: can you check to make sure the tagged downloads still work after you do the branch migration?
19:49 rbuels pyrimidine: sure i'll check
19:49 pyrimidine k
19:49 rbuels pyrimidine: shouldn't affect it though .... (unless i nuke the tags accidentally lol)
19:50 pyrimidine you can always mirror (git clone --mirror) prior to the changes jic
19:50 pyrimidine I'm not really worried
19:51 rbuels ;-)
19:52 rbuels stand aside sir, i'm a professional live-database-nuker
19:53 * pyrimidine shocked and awed
20:33 CIA-42 bioperl-live: Chris Fields master * r38dbd4c / t/SeqFeature/FeatureIO.t : minor tweak to tests - http://bit.ly/9i2mPE
20:33 CIA-42 bioperl-live: Chris Fields master * r54333ea / t/SeqFeature/FeatureIO.t : featureio bug extends to any gff3 or raw fasta - http://bit.ly/cs96En
20:46 wilywonka_ joined #bioperl
20:48 CIA-42 bioperl-live: Adam Sjøgren master * r859408b / Bio/SeqIO/abi.pm : (log message trimmed)
20:48 CIA-42 bioperl-live: Update pod and make get_trace_data() return the current value.
20:48 CIA-42 bioperl-live: The pod references the option -read_graph_data and the method
20:48 CIA-42 bioperl-live: read_graph_data(), but neither are handled by the code; the code
20:48 CIA-42 bioperl-live: uses get_trace_data.
20:48 CIA-42 bioperl-live: The method get_trace_data() is used as an accessor in the code:
20:48 CIA-42 bioperl-live: called without an argument to read the value - but the method
20:53 deafferret wow. quite a backlog today
20:54 deafferret JunY: nope. if you know how to make UCSC display an arrow instead of a bar from .gff, do tell
20:59 deafferret abooz: did you get your patch in?
20:59 deafferret rbuels: did you move the old branches?
21:00 rbuels deafferret: not yet
21:00 abooz deafferret: nope.
21:00 deafferret slackers. all of you
21:00 CIA-42 bioperl-live: Chris Fields master * r5fcfaae / t/Assembly/Assembly.t : make assembly tests a little easier re: number of tests - http://bit.ly/a3X7js
21:00 CIA-42 bioperl-live: Chris Fields master * r2831245 / Bio/SeqIO/abi.pm : Merge branch 'master' of git@github.com:bioperl/bioperl-live - http://bit.ly/d0iZMV
21:00 rbuels deafferret: nobody works when you're not around.
21:00 deafferret see? pyrimidine is busy
21:00 pyrimidine shh, the CIA is watching...
21:00 rbuels deafferret: only cause you're here.
21:00 bag_ joined #bioperl
21:00 deafferret abooz: if you need help I can try  :)
21:01 * deafferret does not fear the Culinary Istitute of America
21:01 deafferret n
21:02 pyrimidine damn the CIA and their coque au vin
21:03 pyrimidine *coq
21:05 rbuels actually i thought it was quite tasty
21:07 rbuels the method of braising before simmering is a trifle fussy, though.
21:08 rbuels i'm an all-stovetop or all-oven man.
21:08 rbuels no in-between.  put on the daddy pants.
21:09 CIA-42 bioperl-live: Chris Fields master * rc86c048 / t/Assembly/Assembly.t : add some file cleanup from samtools testing - http://bit.ly/bpWtcY
21:10 pyrimidine If I ate stovetop too much I couldn't put on the daddy pants
21:10 pyrimidine unless I was a big daddy
21:12 * rbuels wonders about the size of deafferret's pants
21:22 pyrimidine signing out for the day (wife's picking me up)
21:22 rbuels ok see you
21:22 pyrimidine o/
21:22 pyrimidine left #bioperl
21:32 wilywonka_ joined #bioperl
22:24 wilywonka_ joined #bioperl
23:24 CIA-42 bioperl-live: nobody refs/branches/rob_test_branch * rb7af73a / :
23:24 CIA-42 bioperl-live: This commit was manufactured by cvs2svn to create branch
23:24 CIA-42 bioperl-live: 'branch-ensembl-m1'.
23:24 CIA-42 bioperl-live: svn path=/bioperl-live/branches/branch-ensembl-m1/; revision=1140 - http://bit.ly/9maO01
23:26 CIA-42 bioperl-live: nobody rob_test_branch * rb7af73a / :
23:26 CIA-42 bioperl-live: This commit was manufactured by cvs2svn to create branch
23:26 CIA-42 bioperl-live: 'branch-ensembl-m1'.
23:26 CIA-42 bioperl-live: svn path=/bioperl-live/branches/branch-ensembl-m1/; revision=1140 - http://bit.ly/9maO01
23:32 rbuels hrm.  the tractability of git to standard commit message handling is .... limited.
23:33 rbuels what i actually just did is shuffle a branch head to and from archives as a test ....
23:33 rbuels but CIA just printed the commit message of that branch head twice.
23:33 rbuels so much for central intelligence.
23:47 deafferret wow. Mondays are always insane  :(
23:48 deafferret rbuels: what's with all the "cvs2svn" stuff?
23:48 rbuels deafferret: those are the old old commits
23:48 rbuels from the cvs -> svn conversion
23:48 * deafferret starts beating the git2rcs war drums
23:49 deafferret are branch renames commits?
23:56 deafferret rbuels++ # wiki god
23:56 rbuels deafferret: no they're not commits
23:58 deafferret are they versioned? (can they be reversed?)
