
IRC log for #opentreeoflife, 2014-05-29


All times shown according to UTC.

Time Nick Message
00:22 ilbot3 joined #opentreeoflife
00:22 Topic for #opentreeoflife is now Open Tree Of Life | opentreeoflife.org | github.com/opentreeoflife | http://irclog.perlgeek.de/opentreeoflife/today
00:45 towodo joined #opentreeoflife
00:52 josephwb joined #opentreeoflife
02:08 josephwb joined #opentreeoflife
04:03 josephwb joined #opentreeoflife
10:53 josephwb joined #opentreeoflife
11:14 josephwb joined #opentreeoflife
11:35 towodo joined #opentreeoflife
11:55 josephwb joined #opentreeoflife
11:59 josephwb joined #opentreeoflife
12:16 josephwb joined #opentreeoflife
12:29 josephwb joined #opentreeoflife
12:46 josephwb joined #opentreeoflife
12:57 josephwb joined #opentreeoflife
13:34 lcoghill joined #opentreeoflife
13:54 josephwb joined #opentreeoflife
14:02 lcoghill joined #opentreeoflife
15:09 kcranstn joined #opentreeoflife
15:37 mtholder joined #opentreeoflife
15:56 mtholder jimallman: is it OK w/ you if I take the phylesystem-api on ot7 offline for a bit?
15:57 jimallman mtholder: no problem. i’m running a local API at the moment (trying to untangle the new OTU-labeling properties)
15:57 mtholder cool. let me know if you need help with that.
15:57 jimallman thanks. i’m reviewing emails now, retracing our steps :)
15:58 kcranstn on a similar note, is the nexson documentation on the wiki up to date?
15:58 kcranstn (putting together data submission for manuscript)
16:00 mtholder i think that phylesystem, curator app, oti, and phylografter are speaking slightly different variants of nexson. what is on the wiki is probably a 5th variant.
16:01 mtholder So. "no"
16:01 mtholder but not too far off.
16:01 kcranstn the great thing about standards is that there are just so many to choose from… ;)
16:01 mtholder true.
16:02 * jimallman had to step away, catching up now…
16:02 mtholder jimallman: I think that https://github.com/OpenTreeOfLife/phylesystem-0/blob/master/study/pg_94/pg_94/pg_94.json shows the 3 label fields that peyotl thinks are valid
16:03 mtholder check that. it has ^ot:altLabel ^ot:originalLabel
16:03 mtholder but not
16:03 mtholder ^ot:ottTaxonName
16:06 21WAAN4TK joined #opentreeoflife
16:07 jimallman hm, I was not expecting altLabel… will keep reading and ask if I get stuck.
16:08 14WAC1S61 joined #opentreeoflife
16:08 mtholder I think that is just for the partially altered label - safe to ignore if you have another spot for holding user edits to the label
16:08 mtholder "partially altered" ?
16:08 mtholder altered
16:09 jimallman so i could use this to store manual edits to the imported label, as “hints” during OTU mapping?
16:09 mtholder yes
16:09 mtholder but they are rare in our files (and many of them appear to be cruft)
16:09 mtholder so I would not add "support for altLabel" as a high priority feature.
16:10 jimallman has ottTaxonName been deprecated? if so, why? (if this is documented somewhere, just point the way..)
16:10 mtholder no it has not.
16:10 mtholder I meant that that file didn't have any
16:10 jimallman got it, thanks.
16:11 jimallman Should we retain altLabel even if it’s not “interesting” (for example, if it matches the mapped ottTaxonName)? it seems like we could just toss it in this case..
16:12 jimallman mtholder, one more question: is altLabel different from manualLabelFromCuration (from phylografter)?
16:12 mtholder I think that the import tosses it if it is equal to the orig label or ott taxon name.
16:13 mtholder hmm I'll look up the manualLabelFromCuration...
16:13 jimallman OK, then i’ll do the same (toss boring altLabel) when scrubbing Nexson (before saves) in the curation app.
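A minimal sketch of the "toss boring altLabel" rule discussed above, assuming an OTU is held as a plain dict keyed by the ^ot: property names mentioned in the log. The function name and the dict layout are hypothetical; this is not the curation app's or peyotl's actual code.

```python
# Property names as they appear in the discussion above.
ALT = "^ot:altLabel"
ORIG = "^ot:originalLabel"
OTT_NAME = "^ot:ottTaxonName"


def scrub_alt_label(otu):
    """Return a copy of an OTU dict with a redundant altLabel removed.

    Hypothetical helper: an altLabel is treated as redundant when it is
    empty or matches the original label or the mapped OTT taxon name.
    """
    cleaned = dict(otu)
    alt = cleaned.get(ALT)
    if alt is None:
        return cleaned
    if not alt or alt in (cleaned.get(ORIG), cleaned.get(OTT_NAME)):
        del cleaned[ALT]
    return cleaned
```

An altLabel that differs from both other labels is left in place, so a curator's manual edit survives the scrub.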
16:16 mtholder jimallman: I'm not seeing manualLabelFromCuration in the wiki or in the current corpus. is it in an old version of some of the files?
16:16 jimallman i’m walking through emails from March… it was mentioned there, but maybe dropped or renamed.
16:16 mtholder I think that it is basically altLabel.
16:16 mtholder we got altLabel from treebase
16:16 mtholder not all of them are "manual"
16:17 jimallman sounds right. apparently it was proposed to support manual label editing in phylografter
16:17 mtholder so perhaps that is why we preferred it to manualLabelFromCuration
16:17 jimallman yeah, i’m reading that discussion now (retain multiple altLabel values? include the source for each?)
16:18 jimallman i’m not sure how we’d handle multiple altLabel values in OTU mapping. i can see how it might be beneficial, but it makes for some freaky UI changes.
16:18 mtholder I think that it is safe for you to ignore.
16:19 mtholder it seems spiteful to remove the extra ones coming in from treebase, so I'm inclined to leave them in, but I think the curator app could only pay attention to the first one (if present)
16:19 jimallman let’s try it simple for now. i’ll expect to find one (or no) altLabel values, and show it as the manually-edited label for OTU mapping purposes.
16:19 mtholder good
16:20 jimallman and yes, i’ll preserve multiple values if found, and show either the “manual” one or the first one found, i guess. if we come up with a ranking of trust among sources, i could also apply that (ie, manual label if found, else TreeBASE, else phylografter, else none).
16:22 mtholder to be honest, it is even fine with me if the curator app culls the list down to one. If there are >1 in the import, then they'll be in the git history for the study.
16:23 jimallman hm. so other values aren’t lost forever, but recovering them would be a chore..
16:23 mtholder given that they aren't that useful, that is probably fine (if they are culled after the first commit). I think that if there is one that a curator is tweaking, it is nice to save it. But the second or third ones are probably just historical cruft.
16:24 jimallman i’d hate to throw out stuff that might be valuable to someone. let me see if i can preserve them and just add/modify a single ‘manual’ altLabel.
16:25 mtholder yeah. a chore, but unlikely to come up. from my perspective, an advantage of having the history is that we can be somewhat aggressive about keeping the files "clean" - by chucking info that is pretty unlikely to be used by most tools.
16:25 jimallman true dat
16:26 jimallman for our purposes, mapping to an OTT taxon clearly trumps any other label.
16:27 mtholder yeah. and treebase is very willing to add altLabels of questionable value, so they are not really curated content.
16:27 jimallman …so it’s tempting to dump any altLabel value(s) after mapping an OTU, since they’re sort of moot. unless someone comes back to second-guess the mapping, then they might want all available hints..
16:27 mtholder i agree with that.
16:27 mtholder that  = they are moot.
16:28 kcranstn sounds good to me
16:28 jimallman which? ruthless purging, or keeping originalLabel and altLabel in case we return to change mapping?
16:28 mtholder "ruthless purging" is a decent band name.
16:29 jimallman noted!
16:31 jimallman …but the question stands. since there’s a possibility we’ll reconsider a bad OTU mapping, i’d like to keep the latest hints (original + alt labels) around
16:32 mtholder the latest one sounds good. But culling a list to just one is fine - no need to support the full list.
16:36 jimallman OK, that matches the wiki page as well (https://github.com/OpenTreeOfLife/phylesystem-api/wiki/NexSON). This describes ot:altLabel as a single string value that is purged on successful mapping.
16:37 mtholder what a pleasing coincidence.
16:37 mtholder ;-)
16:37 jimallman :) It works well for a simple scenario. If manual edits led to a faulty mapping, it’s probably no loss to dump them.
16:37 kcranstn yes, ruthless purging
16:43 jimallman mtholder:  i see that the multiply-sourced property is actually ^skos:altLabel.. i’ll purge this if found after reducing it to a single-string ot:altLabel.
16:45 mtholder is that in a phylesystem repo, or in the raw translation from a treebase import?
16:53 jimallman i haven’t spotted this in data yet, just in your email of March 21, Re: [OpenTree-software] follow-up on otu label, ottTaxonName and originalLabel discussion on today's "overtime" session of the call
16:53 jimallman lemme check the docstore…
16:54 jimallman OK, no instances of ‘skos’ found in phylesystem-0. so maybe these are already scrubbed out? or just very rare?
16:54 mtholder I think that was the inspiration for the altLabel, and we did the translation that you were suggesting (skos -> ot)
16:54 jimallman cool, thanks
16:56 jimallman mtholder: so that translation (skos -> ot) is being done on initial import? or should i watch for and translate this in the curation app?
16:56 mtholder actually it looks like they were rare, but they could sneak in via treebase, now (see https://github.com/OpenTreeOfLife/peyotl/blob/master/peyotl/external.py)
16:56 mtholder I can change that code to translate them to ot:
16:57 jimallman yes, and simplify to a single string, perhaps based on preferred source(s)
16:57 jimallman kewl
16:57 mtholder I'll take care of it in peyotl in case we ever want to use some cron job for importing a batch of files from treebase.
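The skos -> ot translation described above might look roughly like this. It is only a sketch: it assumes ^skos:altLabel holds either a single string or a list of strings, keeps the first usable value (as suggested earlier in the discussion), and lets an existing ^ot:altLabel win. The function name is hypothetical and this is not the code in peyotl's external.py.

```python
SKOS_ALT = "^skos:altLabel"
OT_ALT = "^ot:altLabel"


def translate_skos_alt_label(otu):
    """Return a copy of an OTU dict with ^skos:altLabel folded into ^ot:altLabel.

    Assumptions (not confirmed by the log): the skos value is a string or a
    list of strings, and an existing ^ot:altLabel takes precedence.
    """
    cleaned = dict(otu)
    raw = cleaned.pop(SKOS_ALT, None)  # the skos property is always purged
    if raw is None:
        return cleaned
    candidates = [raw] if isinstance(raw, str) else list(raw)
    # Keep only non-blank strings, preserving their original order.
    candidates = [c for c in candidates if isinstance(c, str) and c.strip()]
    if candidates and OT_ALT not in cleaned:
        cleaned[OT_ALT] = candidates[0]
    return cleaned
```

Picking by preferred source instead of "first found" would need the source recorded alongside each value, which this sketch does not model.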
17:00 jimallman just to confirm: the curation app should ignore (maybe even scrub) any OTU @label properties, since we’re deprecating this in favor of our more precise properties. yes? i assume we’d restore @label if someone exports in a format that requires it…
17:01 mtholder yes. Let me know if you see any @label.  On export nexml, we'll have to choose one of the meta labels to use as @label, but we can use originalLabel for that.
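For the export case mentioned above, restoring @label from originalLabel could be as small as the following sketch. The function name and the dict-based OTU layout are assumptions for illustration, not the actual NeXML export code.

```python
def with_export_label(otu):
    """Return a copy of an OTU dict with @label filled from ^ot:originalLabel.

    Hypothetical helper for formats that require @label; the OTU is left
    without @label when no original label is present.
    """
    exported = dict(otu)
    original = exported.get("^ot:originalLabel")
    if original is not None:
        exported["@label"] = original
    return exported
```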
17:01 mtholder jimallman: do you have any edits to studies in phylesystem-0 (testing repo) that you would be sad to see disappear? I was debating copying the files from phylesystem-1 to expunge some of the cruft.
17:01 jimallman right, so we’re covered.
17:01 jimallman nah, go nuts.
17:02 jimallman (ok to clobber phylesystem-0, i mean)
17:03 mtholder it has been clobbered. redeploying on ot10 now...
17:15 mtholder OK. all of the files in phylesystem-1 are in phylesystem-0 and all of those are valid except for pg_2918.json
17:17 kcranstn completely different topic: we’ve been talking about not getting brlens from treebase, but I see treebase files with brlens. Is this just inconsistent, or is phylografter stripping them out?
17:17 mtholder I thought they were in new tb studies but not old ones. not sure about that...
17:17 kcranstn ok, so publication date biased
17:18 kcranstn but if the treebase study has branch lengths, then we have branch lengths?
17:28 mtholder not sure. I just used grep to build a phylografter study ID -> treebase study ID map. Haven't checked it yet, but it is at http://phylo.bio.ku.edu/ot/phylografterStudy2TreeBaseStudy.json
17:29 lcoghill phylografter doesn't strip them. there are newick strings in the database with branch lengths. the stree table also has branch_lengths_represent value. It's just most studies don't have them.
17:42 jimallman yes, we’ve definitely seen branch lengths in the curation app, and i’m pretty sure in studies from pg
17:42 kcranstn ok, good to know
17:42 mtholder TreeBase S13888 and phylografter 2466 is one example
17:43 kcranstn I might take a look to see how many studies have brlens
17:47 mtholder 1043 out of 2854 studies in phylesystem-1 (which is up-to-date with pg as of this morning) have the @length property.
17:48 mtholder kcranston ^
17:48 kcranstn that’s not too shabby
17:49 mtholder not sure how many trees. that is just "studies for which at least one tree has a @length" (I don't think @length occurs in other contexts)
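A count like "studies for which at least one tree has a @length" can be approximated by walking each study's parsed JSON and looking for the key anywhere in the structure. This is a sketch under that assumption; the function names are hypothetical and the log does not say how the figure above was actually produced.

```python
import json
from pathlib import Path


def contains_key(node, key):
    """Return True if `key` appears as a dict key anywhere inside `node`."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


def count_studies_with_key(studies, key="@length"):
    """Count parsed studies that contain `key` at least once."""
    return sum(1 for study in studies if contains_key(study, key))


def load_studies(root):
    """Yield parsed JSON documents for every *.json file under `root`."""
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            yield json.load(handle)
```

With a local checkout of a study repository, count_studies_with_key(load_studies(checkout_dir)) would give the per-study count, where checkout_dir is whatever path the clone lives at. Because the search ignores context, it would overcount if @length ever appeared outside tree edges.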
17:57 lcoghill Doing a quick sql count, pg has 1065 trees with branch length values where we know what those values represent.
18:03 jimallman mtholder: fwiw, i don’t recall any other uses of @length in Nexson
18:26 mtholder jimallman: that skos:altLabel -> ot:altLabel stuff is now implemented and deployed.
18:27 jimallman awesome! i’m working on the client-side changes now (fairly wide-ranging), hope to have it done by end of today.
18:32 kcranstn fixes for the strange OTU mapping behaviour seen in the demo?
18:37 jimallman kcranstn: yes
18:37 * jimallman missed her, drat..
18:55 kcranstn joined #opentreeoflife
19:14 jimallman kcranstn: yes, i’m fixing the weird OTU mapping behavior from the demo.
20:17 kcranstn what’s the best way to delete study 1010 (the “please delete me” study)?
20:17 kcranstn oops 1019
20:17 kcranstn http://dev.opentreeoflife.org/curator/study/view/1019
20:30 josephwb joined #opentreeoflife
20:59 mtholder kcranston: not sure the history of that. It is in phylografter ( http://www.reelab.net/phylografter/study/view/1019 ) so we probably should not delete it from the curator app.
21:01 kcranstn joined #opentreeoflife
22:01 josephwb joined #opentreeoflife
22:04 towodo joined #opentreeoflife
22:06 towodo jimallman, any chance of getting an otu mapping screenshot for my talk tomorrow?  doesn't have to be study 2577
22:18 mtholder joined #opentreeoflife
22:21 mtholder joined #opentreeoflife
22:41 towodo joined #opentreeoflife
22:55 jimallman towodo: sorry, still working on this. BUT you should be able to get an accurate screenshot by clicking ‘Clear all visible mappings’, then proceeding as if there were no mapped OTUs.
22:56 jimallman towodo: i’ll try this quickly now and provide a few screenshots by email. sorry for the delay.
23:21 josephwb joined #opentreeoflife
