
IRC log for #opentreeoflife, 2015-01-30


All times shown according to UTC.

Time Nick Message
01:42 jar286 joined #opentreeoflife
06:35 77CAAIBD3 joined #opentreeoflife
12:08 mtholder joined #opentreeoflife
13:46 jar286 joined #opentreeoflife
15:09 jar286 joined #opentreeoflife
15:10 jar286 joined #opentreeoflife
15:56 mtholder jar286, is there an easy way to produce a subset of OTT (like the Asterales subset) ?
15:56 mtholder i.e. is there a tool that I can give an OTT ID and get the files for its descendants?
16:03 jar286 hi
16:04 jar286 hmm.  look at the Makefile
16:04 jar286 I think it’s called ‘select’
16:04 jar286 can be used from jython
16:04 jar286 ^ mtholder
16:34 josephwb mtholder there are subset scripts in treemachine
16:34 josephwb python scripts, i mean
16:34 josephwb in the treemachine repo
16:34 josephwb doesn't deal with flags
16:35 josephwb give it a taxon, returns taxonomy below that
16:35 josephwb https://github.com/OpenTreeOfLife/treemachine/blob/master/scripts/subset_taxonomy.py
16:35 josephwb and he is not even here...
16:35 josephwb d'oh
16:48 mtholder joined #opentreeoflife
16:50 jar286 joined #opentreeoflife
16:53 mtholder thanks josephwb for the pointer in the email.
17:56 mtholder joined #opentreeoflife
18:08 jar286 joined #opentreeoflife
21:01 josephwb you there mtholder?
21:01 mtholder yes.
21:01 mtholder what's up?
21:02 josephwb can i pick yer brain about floating point equality checking in C++11?
21:02 mtholder sure.
21:03 josephwb um, how do you pull that off?
21:03 mtholder "abs(x - y) < TOL" for some tolerance TOL is the usual way
21:03 josephwb i have tried absolute and relative "epsilon" checks.
21:03 mtholder or is it fabs
21:03 josephwb yes, that
21:03 josephwb but what for TOL?
21:03 josephwb this fails: numeric_limits<double>::epsilon()
21:04 mtholder really depends on the calculation, I'm afraid. If you have an algorithm that loses lots of precision, then 1E-8 or something
21:04 josephwb my EPSILON == your TOL
21:04 mtholder like that.
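
[Editor's sketch] The absolute-tolerance check described above could look like the following in C++11. The helper name `nearly_equal_abs` is hypothetical, and the right `tol` depends on the calculation, as noted above (1E-8 is only the example value mentioned). `std::fabs` from `<cmath>` is used so the arguments are never truncated to `int`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true when x and y differ by less than tol.
// tol is an absolute tolerance chosen by the caller (e.g. 1e-8).
inline bool nearly_equal_abs(double x, double y, double tol) {
    assert(tol >= 0.0);  // a negative tolerance would never match
    return std::fabs(x - y) < tol;
}
```

For example, `nearly_equal_abs(0.1 + 0.2, 0.3, 1e-8)` is true even though `0.1 + 0.2 == 0.3` is false for doubles.
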
21:04 josephwb ok. the epsilon i noted above = 2.22045e-16
21:05 mtholder In likelihoods in phylogenetics, the exponentiating of the pmat has lots of issues with numerical precision.
21:05 mtholder so usually expect E-10 for likelihood comparisons.
21:05 josephwb i am just dealing with edge lengths.
21:05 josephwb error creeps in for, e.g., tip-to-root path lengths
21:06 josephwb the analogous float epsilon = 1.19209e-07
21:06 josephwb that works (for now), but since I am using doubles...
21:06 mtholder what is numeric_limits round_error()
21:06 mtholder ?
21:07 mtholder http://www.cplusplus.com/reference/limits/numeric_limits/
21:08 josephwb 0.5?
21:08 mtholder let me check...
21:09 josephwb that's what i got
21:11 josephwb trying to check ultrametricity; you can see how error can enter in.
21:11 mtholder yeah. you're probably limited by precision of the branch lengths in the files that you read in (if you are reading branch lengths).
21:12 josephwb yes
21:12 josephwb probably float (from R for testing)
21:14 josephwb i check against:
21:14 josephwb max(EPSILON, EPSILON * max(abs(a), abs(b)))
21:14 josephwb should handle scale
21:14 josephwb i can just use the float epsilon for now.
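
[Editor's sketch] The scaled check quoted above, `max(EPSILON, EPSILON * max(abs(a), abs(b)))`, might be written as below. The function name `nearly_equal_scaled` is hypothetical, and the choice of `eps` (for instance the float epsilon mentioned above) is left to the caller. The tolerance is absolute for small values and grows with magnitude for values above 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: compares a and b against a tolerance that is
// eps for small magnitudes and eps * max(|a|, |b|) for large ones.
inline bool nearly_equal_scaled(double a, double b, double eps) {
    assert(eps >= 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    const double tol = std::max(eps, eps * scale);
    return std::fabs(a - b) <= tol;
}
```

With `eps = 1.19209e-07`, two path lengths near 1e6 may differ by about 0.1 and still compare equal, which may or may not be acceptable for a given tree.
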
21:14 mtholder so std::numeric_limits<double>::digits10 tells you how many decimal digits you have in the mantissa.
21:14 josephwb ooh.
21:14 mtholder but if you are reading doubles from a file...
21:15 mtholder then you could be off by as much as .5 * the place of the last digit
21:15 josephwb ((((s1:0.3603553431,s2:0.3603553431):0.8968782862,....
21:15 josephwb ah
21:15 mtholder so that could be much larger than the numerical issues (when you count the number of additions that you have to do).
21:15 josephwb exactly
21:16 mtholder 0.3603553431 could be 0.36035534305 or 0.36035534315
21:16 mtholder so up to 5E-11 off for that number.
21:17 mtholder times the number of additions should give you a decent TOL or EPSILON.
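
[Editor's sketch] One way to turn that reasoning into code, assuming branch lengths are printed with a fixed number of decimal places (10 in the Newick fragment above) and that tip-to-root path lengths have already been computed elsewhere. Both function names are hypothetical. The bound covers one sum; when comparing two path sums, a caller may want to count the additions in both.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: worst-case error from summing n_additions values
// that were each rounded to `decimals` decimal places when written out.
inline double file_precision_tol(int decimals, int n_additions) {
    assert(decimals >= 0 && n_additions >= 0);
    const double half_unit = 0.5 * std::pow(10.0, -decimals);
    return half_unit * n_additions;
}

// Hypothetical helper: treats a tree as ultrametric when the spread of
// its tip-to-root path lengths is within tol.
inline bool is_ultrametric(const std::vector<double>& path_lengths, double tol) {
    if (path_lengths.empty()) return true;
    const auto mm = std::minmax_element(path_lengths.begin(), path_lengths.end());
    return (*mm.second - *mm.first) <= tol;
}
```

For example, `is_ultrametric(lengths, file_precision_tol(10, n))` would use a tolerance of `n * 5e-11`.
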
21:17 josephwb have you played with ULPs? seems overkill. https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
21:18 mtholder no I haven't looked at that.
21:18 josephwb ok, i've bugged you enough. thanks.
21:19 mtholder good luck. these things are always a pain.
21:19 josephwb werd.
21:20 josephwb i hope we get the treemachine issues figured out. seems like we are not sure which ones are actual issues at this point?
21:20 mtholder I think that all 111 in the new issue are legit.
21:20 mtholder but I just wrote that checker this week - so it is certainly possible that it has bugs.
21:21 josephwb ok.
21:21 mtholder do y'all agree that every edge in the synth should be supported by at least one input?
21:21 josephwb yes!
21:21 mtholder I'm not sure if I'm just not understanding the synthesis procedure.
21:21 josephwb minimally taxonomy.
21:22 mtholder that is definitely not something that all supertree methods promise.
21:22 josephwb i am the one who wrote in the response letter "no novel relationships can arise..."
21:22 mtholder but I think that it should be true of TAG
21:23 josephwb "novel" meaning "not present in any source"
21:23 mtholder I still think that https://devtree.opentreeoflife.org/opentree/otol.draft.22@3846500/Asterolinon-adoense--Anagallis-monelli
21:23 mtholder is going to be the easiest to debug
21:23 mtholder "novel" in my output means not present in Ruchi's list.
21:24 mtholder my software calls an edge unsupported if you could collapse it  and not increase its RF distance to any of the inputs.
21:24 josephwb i saw that
21:24 mtholder I was sloppy when I initially defined it in https://github.com/OpenTreeOfLife/treemachine/issues/156
21:25 mtholder I should update that original post to avoid more confusion...
21:25 josephwb did you do a synthesis?
21:25 josephwb there is a way to look at the graph in a browser
21:25 mtholder no I got distracted with other stuff today.
21:26 mtholder I was going to extract just Primulaceae and see if the weirdness for https://devtree.opentreeoflife.org/opentree/argus/otol.draft.22@3846500/Asterolinon-adoense--Anagallis-monelli
21:26 mtholder persists.
21:26 josephwb compile server plugins, run neo4j, http://localhost:7474/webadmin/
21:26 josephwb yes, a small example would be easier
21:35 josephwb i'll look at it further, but nothing immediately pops out.
21:35 mtholder The only two trees that are involved in that case are pg_2661_6198.tre pg_50_1397.tre
21:36 josephwb and taxonomy
21:36 mtholder yeah. oddly it looks like taxonomy is winning.
21:36 josephwb hmm.
21:37 mtholder at least in terms of the placement of Lysimachia
21:37 mtholder see https://devtree.opentreeoflife.org/opentree/otol.draft.22@3382292/Primulaceae
21:39 mtholder the other small case (though it involves multiple source trees) is:
21:39 mtholder https://tree.opentreeoflife.org/opentree/argus/otol.draft.22@3869643
21:40 josephwb ok
21:40 mtholder (note the odd placement of Lyngbya sp. BAN TS31, with the genus label Lyngbya being applied to the sister clade.)
21:41 mtholder Also note that in the earlier case pg_2661_6198.tre and pg_50_1397.tre are the only trees with Lysimachia
21:41 josephwb yikes.
21:41 mtholder I didn't mean to imply that they were the only relevant to Primulaceae
21:41 josephwb think i'll look at the 1st one.
21:43 josephwb Lyngbya sp. BAN TS31 almost looks like a synonym issue. doesn't seem to be, though.
21:43 josephwb has synonyms, but just ncbi ids
21:44 josephwb wait a sec...
21:45 josephwb the Lyngbya sp. BAN TS* species have different parents in the taxonomy
21:46 josephwb found it
21:47 josephwb Lyngbya sp. BAN TS31 has Trichodesmium as its parent in the taxonomy!
21:47 josephwb Lyngbya sp. BAN TS30 has Lyngbya as its parent
21:47 josephwb you still there mtholder?
21:47 mtholder Ah. OK. that explains the name. I think that no tree supports Lyngbya + Trichodesmium as a clade though
21:48 mtholder the Lyngbya weirdness is not on the list of mis-named nodes at https://github.com/OpenTreeOfLife/treemachine/issues/154
21:48 mtholder sorry if i got us side-tracked
21:49 mtholder forgot to read my own comment on the issue...
21:49 mtholder it says:
21:49 mtholder The README.txt in https://www.dropbox.com/sh/lnmcr208fk9gr1v/AADJTANQD1ykfg7Ud0Q6Lux9a?dl=0
21:49 mtholder is my notes on some additional (partially manual) follow up tests that make me think that the MRCA of "Trichodesmium ott196997" and "Lyngbya ott878838" in the synthetic tree is not supported by any of the 4 sources (pg_2542_5590, pg_2554_5580, pg_2739_6601, pg_2891_6699, or the taxonomy).
21:50 mtholder all of the nodes in the taxonomy have names, right?
21:50 josephwb yes
21:58 josephwb the pg_2542_5590 tree is a monster
21:59 mtholder yes. that is the problem with this project. everything is too damned big.
22:00 josephwb i checked, and the tree is rooted, so that is not the problem.
22:06 josephwb i'll pick this up later tonight.
22:08 mtholder ok. good night...
22:08 pmidford2 joined #opentreeoflife
