IRC log for #opentreeoflife, 2015-02-04

All times shown according to UTC.

Time Nick Message
01:11 jar286 joined #opentreeoflife
01:12 jar286 joined #opentreeoflife
01:26 jar286 joined #opentreeoflife
01:59 jar286 joined #opentreeoflife
02:19 jar286 joined #opentreeoflife
04:43 kcranstn joined #opentreeoflife
06:23 mtholder joined #opentreeoflife
12:36 josephwb wassup ilbot3?
12:52 mtholder joined #opentreeoflife
13:15 kcranstn joined #opentreeoflife
14:23 kcranstn joined #opentreeoflife
15:20 kcranstn joined #opentreeoflife
15:42 mtholder josephwb, Do I need to be calling any extra functions in my tests?
15:42 mtholder there is a mapcompat. Do I have to call that?
17:15 jimallman joined #opentreeoflife
17:29 mtholder joined #opentreeoflife
17:43 josephwb mtholder: yes
17:44 mtholder when do I have to run it?
17:44 mtholder josephwb^
17:45 josephwb let me check where we do it.
17:47 josephwb ingest tree, then mapcompat
17:47 josephwb prior to synthesis
17:47 josephwb is that what you mean?
17:47 mtholder yes
17:48 mtholder after all trees are ingested or after each one?
17:48 josephwb we have been doing it after each one
17:48 mtholder OK I'll try that.
17:49 josephwb look at "load_synth_extract.py" in gcmdr
17:52 josephwb this looks like it is done on a taxonomy-only DB
17:52 josephwb for really sparse trees, e.g. the paper placing turtles, had ~8 taxa from all vertebrates
17:52 josephwb tetrapods, rather
17:54 mtholder sorry, you mean I don't do it on the synthesis db?
17:54 josephwb nope
17:55 mtholder then what effect does it have?
17:55 josephwb make a db just for that process
17:56 josephwb for the turtle example: the study is not saying that one, say, mammal species is related to one bird species (or, whatever)
17:56 mtholder I was just trying to make sure that I was doing everything that I have to do before I synthesize in my tests
17:56 josephwb it walks back the taxonomy to find deeper mappings
17:56 josephwb i don't think you have to worry about it if the taxonomy is a polytomy
17:57 mtholder my tests call: inittax, addnewick, synthesize and then extractdrafttree
17:58 josephwb should be fine
17:58 josephwb addnewick is not used, so i am afraid it might be out of step (maybe)
17:59 josephwb maybe not
17:59 mtholder does the difference in node sharing described in https://github.com/OpenTreeOfLife/treemachine/issues/161 make sense to you?
17:59 josephwb i haven't seen that yet
17:59 josephwb just a sec
18:02 josephwb 1st figure looks like the one i drew
18:02 josephwb i am not familiar with tree3
18:03 mtholder in the second example tree3 is the same as tree2 from the previous example.
18:03 mtholder I added a different tree as tree2
18:03 mtholder and it seems to keep tree1 and tree3 from aligning to each other
18:03 kcranstn can you put the newick strings in the issue?
18:03 josephwb right. i will have to look at that one.
18:03 josephwb yes, that would be helpful
18:04 mtholder The inputs are in https://github.com/OpenTreeOfLife/treemachine/tree/test-refactor/test-synth/ranksforextras
18:04 kcranstn got it
18:08 kcranstn taxonomy is polytomy?
18:08 kcranstn yup, I see it
18:10 josephwb mtholder: mapcompat is done in synth DB
18:10 josephwb on tree ingest
18:11 josephwb sorry for the confusion
18:11 mtholder OK. thanks
18:11 josephwb if it is a polytomy, not relevant
18:13 josephwb what are we supposed to glean from that figure?
18:13 mtholder that the alignment in the TAG is...
18:13 mtholder I dunno
18:13 mtholder broken? hard to understand?
18:13 mtholder I'm not sure.
18:13 josephwb is this an order thing?
18:14 mtholder is it not surprising to you that n17 and n9 differ in the second figure
18:14 josephwb we used to read in all (creating nodes), delete the trees, and re-read them back in.
18:14 mtholder but are both n9 in the previous one
18:15 mtholder synthesis from node9 in the first case can find taxon E.
18:15 mtholder but in the second case it can't
18:15 mtholder (because of tree ranking this is OK in this case, but the difference in the TAG worries me)
18:16 josephwb [having trouble following tree and node ids...]
18:16 josephwb please bear with me
18:16 pmidford2 joined #opentreeoflife
18:16 mtholder yeah it is a lot of numbers/labels
18:16 mtholder I've stared at it for a while.
18:16 josephwb the tree rename is confusing!
18:17 mtholder yeah my tests use the number order to generate the ranking. So I couldn't move tree2 down in the list without changing its name...
18:18 josephwb we are all staring at it right now
18:19 mtholder black S edges are SYNTHCHILDOF
18:19 mtholder taxonomy is dashed
18:19 mtholder dashed red
18:20 mtholder the other colors, unfortunately, change in my dot export code
18:21 kcranstn ok, I see the n9 and n17 issue
18:21 kcranstn I would expect mrca(tree3,(A,E,B)) and mrca(tree1,(A,B)) to use the same node
18:22 mtholder right. that is what they do in the other example.
18:31 kcranstn joined #opentreeoflife
18:31 kcranstn sorry - was staring at whiteboard
18:31 kcranstn back now
18:41 kcranstn anyone still there?
18:42 mtholder I'm here.
18:46 josephwb discussing junk here. might be a while. cody has the dry erase marker.
18:50 kcranstn :)
19:01 jar286 joined #opentreeoflife
19:56 josephwb mtholder: stephen and cody will be contacting you
19:56 mtholder ok
19:56 kcranstn that sounds ominous
19:58 codiferous joined #opentreeoflife
19:58 josephwb treemachine has a function "graphReloadTrees" that will remove all trees from the graph, and reload them in, potentially finding better mappings.
19:58 josephwb should fix ingest ordering problems.
20:00 codiferous identifying node mappings is order dependent. the way to get around it is remap tree in the context of all the other trees once they've all been added. you have to repeat the entire procedure until no new mappings are found
20:00 josephwb another function is "mapcompat". unfortunately, it takes individual sources as arguments. but should be easy to reprocess all.
20:00 codiferous currently that is not being done for synthesis because it didn't seem to be having much/any effect and it takes a long time
20:01 codiferous the number of compatible mappings increases with conflict and partially overlapping sampling
20:02 codiferous for the example in question, both mrca(tree3,(A,E,B)) and mrca(tree1,(A,B)) should be mapped to both n17 and n9
20:03 mtholder how can I tell if I need to issue one of these commands?
20:04 josephwb ?
20:04 codiferous afaik, the only way to know if there are yet-to-be-found mappings is to reprocess all the trees
20:05 mtholder I meant the "until no new mappings are found"
20:05 mtholder is remapping once enough?
20:05 mtholder if not, how will I know?
20:05 codiferous right, yeah we never went as far as providing that output
20:06 codiferous currently, i don't think it tells you if it found a new mapping
20:07 mtholder so lots of times.. OK but pgdelind as in the email or mapcompat mentioned here ? or both?
20:09 codiferous well, actually it is pretty easy to add
20:09 josephwb no, use: graphReloadTrees
20:10 codiferous i wasn't quite right. you just have to keep reprocessing the trees until no new nodes are added
20:10 josephwb does them all
20:10 mtholder The "reprocess" command? josephwb
20:11 josephwb yerp
20:11 josephwb doesn't state if new nodes are created, but easy to add.
20:12 codiferous also, need to correct myself: in the example, trees 1 and 3 should be mapped to node 17, but only tree 1 is mapped to node 9 (because mapping tree 2 to node 9 makes tree 3 incompatible, which is why 17 is created)
20:12 codiferous tree 1 isn't mapped to 17 because 17 doesn't exist when tree 1 is added.
20:13 mtholder hmm reprocess says:
20:13 mtholder Exception in thread "main" org.neo4j.graphdb.NotFoundException: 'newick' property not found for NodeImpl#2.
20:13 mtholder at org.neo4j.kernel.impl.core.Primitive.newPropertyNotFoundException(Primitive.java:193)
20:13 mtholder at org.neo4j.kernel.impl.core.Primitive.getProperty(Primitive.java:188)
20:13 mtholder at org.neo4j.kernel.impl.core.NodeImpl.getProperty(NodeImpl.java:53)
20:13 mtholder at org.neo4j.kernel.impl.core.NodeProxy.getProperty(NodeProxy.java:155)
20:13 mtholder at opentree.GraphImporter.deleteAllTreesAndReprocess(GraphImporter.java:1028)
20:13 mtholder at opentree.MainRunner.graphReloadTrees(MainRunner.java:495)
20:13 mtholder at opentree.MainRunner.main(MainRunner.java:3161)
20:14 josephwb hmm. let me see. hasn't been used in a while.
20:14 mtholder I'll guess that is not it...
20:14 kcranstn it is feeling like there is a ton of missing documentation....
20:15 codiferous ha! indeed
20:15 josephwb no, just pilot code
20:15 josephwb i guess the "newick" property has changed (or been taken out)
20:15 mtholder what does gcmdr do?
20:15 josephwb gcmdr uses nexsons
20:16 mtholder so I wouldn't need to do this if I convert my newicks to nexsons?
20:16 josephwb hmm, i see "newick" in your graph...
20:16 josephwb this seems not too difficult
20:24 codiferous mark, just to make sure i am understanding correctly what your fixes do...
20:25 codiferous at a node, you still compare just the set of sibling relationships to one another, but you are not walking down the graph toward the tips and then back up, is that correct?
20:29 mtholder the subsumes stuff?
20:29 mtholder codiferous^
20:29 josephwb he is away at the moment
20:29 mtholder assuming that is what he was referring to...
20:29 josephwb at the whiteboard
20:30 mtholder now the resolver adds the highest rank relationships
20:31 mtholder if another edge conflicts with only one added relationship and it "subsumes it" (it is possible to reach the startnode of the first rel via the later conflicting one)
20:31 mtholder then the later rel replaces the former.
20:31 mtholder so it walks towards the tips looking for the startnode of the rel that it might "subsume"
20:31 mtholder btw
20:31 mtholder 4 rounds of pgdelind and addnewick helped.
20:42 mtholder from looking at the test-synth/ranksforextras/out/synthesize.log it still looks like the logic is not correct and we are passing ranksforextras by luck
20:49 josephwb mtholder: found the process problem. solution imminent.
20:49 mtholder thanks.
20:50 josephwb "taxonomy" recorded as a tree for some reason...
20:52 pmidford2 joined #opentreeoflife
21:00 josephwb mtholder: sent fixed code for "deleteAllTreesAndReprocess", penultimate function in GraphImporter
21:01 josephwb sent via email
21:01 josephwb gmail
21:01 josephwb gmail email
21:01 kcranstn via email?
21:01 mtholder OK. I probably won't be able to get to it tonight. I think the pgdelind then addnewick 4 times worked.
21:06 josephwb is it still creating nodes after 3 times?
21:07 josephwb mtholder ^
21:07 mtholder Oh. I didn't know how to check.
21:07 mtholder just looked at the graph after all iterations.
21:08 josephwb not in there, but i have a "verbose" version that i can push somewhere
21:08 josephwb in function "postOrderAddProcessedTreeToGraph", the last "else" is when a new node is created.
21:09 josephwb add:
21:09 josephwb System.out.println("Creating new graph node: " + newLicaNode);
21:09 josephwb you can do something fancier, no doubt
21:16 kcranstn joined #opentreeoflife
21:30 mtholder thanks, josephwb.  I'm signing off.
21:33 kcranstn joined #opentreeoflife
22:09 codiferous joined #opentreeoflife
22:11 codiferous hey mark, sorry, yes. that was the question
22:13 codiferous but there is an additional issue that is challenging to solve when proceeding root -> tips, which is that though you may prefer a node X with descendants (a, b, c, d) over a node Y with (a, b), it is still possible for you never to reach a or b via X
22:13 codiferous because other trees may conflict with the relationships accessible from X that would lead to a and b, but you haven't seen them yet
22:14 codiferous we have been talking here about an idea for walking backward from the tips that would solve this
22:17 codiferous and in the trivial case where there is no ranking, it works in the examples we've looked at, which include your recent examples as well as a more complex example that still produces non-optimal output with the root->tip code (even if it subsumes)
22:20 codiferous but ranking introduces some more challenges that we are trying to work out. happy to do video chat at some point if you want a more thorough explanation
22:21 codiferous i will be working on implementing the trivial-case algorithm without ranks, and it should be straightforward to add the ranking to it when we figure out the right way to do it
22:23 jimallman codiferous: hi! mtholder isn’t here anymore..
22:24 codiferous i know, i just figured this was a good place to leave him a book to read
22:24 codiferous i'm done though, so there shouldn't be any more spam from me today
22:54 pmidford2 joined #opentreeoflife
23:03 codiferous joined #opentreeoflife
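
The "keep reprocessing the trees until no new nodes are added" advice from the 20:00 to 20:10 exchange amounts to a fixed-point loop. The sketch below is only an illustration of that loop, not treemachine code: the class ReloadUntilStable, the method reloadUntilStable, and the toy pass are all hypothetical names invented here, and the labels n9 and n17 are borrowed from the discussion purely as placeholders. The toy pass is deliberately order dependent (it can only add n17 once n9 already exists), so a single pass is not enough.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

public class ReloadUntilStable {

    // Runs reloadPass repeatedly and stops as soon as a pass leaves the
    // node count unchanged, or after maxRounds passes. Returns the number
    // of passes actually run (including the final, no-change pass).
    // Hypothetical helper: this is NOT part of treemachine.
    static <N> int reloadUntilStable(Set<N> graphNodes,
                                     Consumer<Set<N>> reloadPass,
                                     int maxRounds) {
        int rounds = 0;
        while (rounds < maxRounds) {
            int before = graphNodes.size();
            reloadPass.accept(graphNodes);
            rounds++;
            if (graphNodes.size() == before) {
                break; // no new nodes: the mapping has stabilised
            }
        }
        return rounds;
    }

    // Toy, order-dependent stand-in for "remove all trees and reload them":
    // "n17" can only be created if "n9" already existed before this pass.
    static void toyPass(Set<String> nodes) {
        if (nodes.contains("n9")) {
            nodes.add("n17");
        }
        nodes.add("n9");
    }

    public static void main(String[] args) {
        Set<String> nodes = new LinkedHashSet<>();
        int rounds = reloadUntilStable(nodes, ReloadUntilStable::toyPass, 10);
        System.out.println(rounds + " passes, nodes = " + nodes);
    }
}
```

In this toy setup the first pass adds n9, the second adds n17, and a third pass is still needed to confirm that nothing changed. Presumably that is why a report of newly created nodes, like the println suggested at 21:09, would be needed to know when to stop. The maxRounds cap is an added safeguard, not something mentioned in the discussion.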
