IRC log for #opentreeoflife, 2014-08-29

All times shown according to UTC.

Time Nick Message
00:27 kcranstn joined #opentreeoflife
00:57 towodo joined #opentreeoflife
01:21 codiferous joined #opentreeoflife
02:32 jimallman joined #opentreeoflife
03:19 jimallman joined #opentreeoflife
04:18 josephwb joined #opentreeoflife
05:34 guest|74990 joined #opentreeoflife
05:35 guest|74990 left #opentreeoflife
09:20 mtholder joined #opentreeoflife
09:46 scrollback1 joined #opentreeoflife
11:25 josephwb joined #opentreeoflife
11:40 josephwb hey mtholder. what time is it there?
11:49 mtholder hi josephwb, it is 1:48PM in Heidelberg
12:09 josephwb joined #opentreeoflife
12:13 towodo joined #opentreeoflife
12:15 josephwb joined #opentreeoflife
12:43 josephwb joined #opentreeoflife
13:23 codiferous joined #opentreeoflife
13:47 jimallman joined #opentreeoflife
14:06 josephwb i fear I may miss the meeting today
14:06 josephwb i don't think i have much to contribute, tho
14:07 josephwb codiferous knows my feelings on things
14:08 josephwb i will only be involved with the treemachine changes
14:08 josephwb as it stands, i am leaving all of the existing plugins present
14:08 josephwb making a separate file for the new ones
14:09 josephwb (not pushed yet)
14:09 kcranstn joined #opentreeoflife
14:11 josephwb only new plugin i have is "node_status", which queries a node, tells whether it is in the graph, in the synthetic tree, number of children, source tree support, etc.
14:12 josephwb could include "trees that do not support", as soon as we formalize what that means
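A minimal sketch of what a "node_status" lookup like the one josephwb describes might return. The log does not show the real response format, so the `Node` class, the field names, and the toy graph below are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not the treemachine data model."""
    node_id: str
    children: list = field(default_factory=list)
    in_synth_tree: bool = False
    supporting_trees: list = field(default_factory=list)

def node_status(graph, node_id):
    """Report whether a node is in the graph and, if so, summarize it."""
    node = graph.get(node_id)
    if node is None:
        return {"node_id": node_id, "in_graph": False}
    return {
        "node_id": node_id,
        "in_graph": True,
        "in_synth_tree": node.in_synth_tree,
        "num_children": len(node.children),
        "supporting_trees": sorted(node.supporting_trees),
    }

# Toy graph keyed by node id (illustrative values only).
graph = {
    "n1": Node("n1", children=["a", "b"], in_synth_tree=True,
               supporting_trees=["tree_2", "tree_1"]),
    "a": Node("a", in_synth_tree=True),
    "b": Node("b"),
}

status_n1 = node_status(graph, "n1")
status_missing = node_status(graph, "zzz")
```

A "trees that do not support" field could be added to the same dictionary once its meaning is settled.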
14:13 josephwb treemachine changes will be on new branch "plugins"
14:35 jimallman node_status sounds cool
14:45 towodo joined #opentreeoflife
14:51 towodo joined #opentreeoflife
14:52 josephwb node_status *is* cool ;-)
14:52 josephwb idea came from codiferous
14:54 towodo joined #opentreeoflife
15:03 codiferous jimallman, we are meeting now on g+ if you can come
15:03 jimallman i’m there!
15:03 jimallman here, wherever
16:10 towodo joined #opentreeoflife
16:17 kcranstn doc here: https://docs.google.com/document/d/1N-DN5Og9hcVvFk0BaixkIzqc4oLbjKDhMZSRSGorEVE/edit
16:18 kcranstn start by adding descriptions? and phylesystem methods?
16:29 kcranstn codiferous - I guessed at tree vs graph
16:30 codiferous can we use draft_tree instead of just tree? i think tree is a little too general
16:30 kcranstn but tree is our primary product
16:30 kcranstn we are building the tree
16:31 kcranstn anyone else? thoughts?
16:31 codiferous but there are many trees. we always call it the synthetic tree, or draft tree when we talk about it
16:37 kcranstn draft_tree can return either a subtree, or the whole tree?
16:38 codiferous yeah, joseph pointed that out
16:38 codiferous so we may want to retire the "subtree" method since "draft_tree" makes it redundant
16:43 kcranstn what is the return format for mrca?
16:46 codiferous json, containing a bunch of information about the mrca, or the mrta (taxonomy only), depending on the request
16:48 kcranstn for graph/source_tree, why are the identifiers different than in phylesystem?
16:48 codiferous do trees have unique identifiers in phylesystem?
16:49 kcranstn I sense a rabbit hole that we’ve been down before. Forget I asked
16:51 codiferous they could be anything, that format is not intended to be stable... if we give them unique ids at some point, we can switch to those easily
16:51 kcranstn is there not a method to get the list of studies / trees?
17:00 codiferous we proposed adding it to the graph/info method. maybe it makes more sense to keep it separate?
17:00 josephwb yes, there is a method to get studies / trees
17:00 codiferous there are two lists of interest:
17:00 codiferous 1. all the source trees in the graph
17:00 codiferous 2. the source trees in the synthetic tree
17:01 codiferous currently these lists are always the same
17:01 josephwb kcranstn: did you mean from treemachine?
17:01 codiferous yes, a method exists now in treemachine
17:01 kcranstn I didn’t see it on the list
17:02 josephwb ok, i will need to catch up on this discussion at some point
17:17 kcranstn are there cases where the mrta from taxomachine would differ from the mrta in treemachine?
17:17 josephwb yes
17:17 kcranstn because something got filtered out of ott upon import?
17:18 josephwb possibly
17:20 codiferous because treemachine may use a different taxonomy than taxomachine
17:20 codiferous they aren't linked
17:21 josephwb oh, yeah, that could be a problem too
17:21 josephwb but taxonomy space will always be smaller in treemachine
17:21 kcranstn any reason why they shouldn’t always be the same
17:21 josephwb ideally they would keep in step
17:21 kcranstn version in taxomachine = version in treemachine
17:22 kcranstn ok
17:22 codiferous yeah, ideally they should be most of the time, but can't be guaranteed. for release, they *really* should be, but on dev they frequently differ
17:23 kcranstn trying to balance an api that is accurate and provides access to what people want with one that contains all possible methods but requires detailed understanding of our architecture
17:23 josephwb kcranstn: mrta still uses a synthesis walk. So, if taxonomy is strongly discordant with synthesis, mrta could be different in treemachine and taxomachine
17:23 codiferous huh?
17:24 josephwb i.e. you first find MRCA. then, if necessary, walk back more until you find a named node
17:24 codiferous oh, that's what i thought it did originally. in that case, it's just mrca, with additional info about the closest ancestral taxon
17:24 codiferous mrta would mean walking taxonomy, not synthesis
17:24 josephwb right
17:24 josephwb just taxonomy?
17:24 josephwb ok
17:25 josephwb hmm...
17:25 codiferous so we can just have mrca for now. i think we can just include the closest ancestral taxon info in the results
17:25 kcranstn so the argument to tree/mrca really just specifies the return - neo4j node id vs first ancestor with ott id
17:25 kcranstn we just said the same thing
17:26 codiferous lol
17:26 codiferous if people need treemachine to return a true mrta for some reason, we can add that later
17:27 codiferous it's very low overhead to find the closest ancestral taxon, so i think we can simplify by always doing this and then there is no need for the argument
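The approach discussed above (find the MRCA in the tree, then walk back until a named node is found) can be sketched as follows. This is an illustration on a toy parent map, not the treemachine implementation; the function names, node ids, and taxon labels are hypothetical.

```python
def path_to_root(parent, node):
    """Return [node, parent, grandparent, ...] up to the root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def mrca(parent, tips):
    """Most recent common ancestor of the given tips."""
    if not tips:
        raise ValueError("need at least one tip")
    paths = [path_to_root(parent, t) for t in tips]
    common = set(paths[0]).intersection(*(set(p) for p in paths[1:]))
    # The first shared node on any tip-to-root path is the most recent one.
    for node in paths[0]:
        if node in common:
            return node
    raise ValueError("tips do not share an ancestor")

def mrca_with_nearest_taxon(parent, taxon_of, tips):
    """Always return the MRCA plus the closest ancestral node that has a taxon."""
    node = mrca(parent, tips)
    for ancestor in path_to_root(parent, node):
        if ancestor in taxon_of:
            return {"mrca_node": node, "nearest_taxon": taxon_of[ancestor]}
    return {"mrca_node": node, "nearest_taxon": None}

# Toy tree: child -> parent. Only some internal nodes map to a taxon.
parent = {"a": "n1", "b": "n1", "c": "n2", "n1": "n2", "n2": "root"}
taxon_of = {"n2": "taxon_X", "root": "life"}

result_ab = mrca_with_nearest_taxon(parent, taxon_of, ["a", "b"])
result_ac = mrca_with_nearest_taxon(parent, taxon_of, ["a", "c"])
```

Because both values come back in one result, a caller no longer needs a "taxonomy" vs "synth" argument or two separate queries.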
17:27 josephwb so, don't specify "taxonomy" vs "synth"
17:27 kcranstn I already deleted that
17:28 josephwb you move too fast
17:28 kcranstn it feels to me like we are making more changes than what we talked about on the call (i.e. will require more work to make this documentation match with the code)
17:29 codiferous which ones?
17:29 kcranstn for example, the return info on mrca
17:29 codiferous i don't think it will be more work to update the docs and i am prepared to do it. they need a pretty extensive overhaul anyway
17:29 josephwb kcranstn: so jimallman's alternative tests in the curator would only have to query once instead of twice
17:29 codiferous yes
17:29 josephwb right, I am happy to do the treemachine stuff
17:30 josephwb ok, so i will keep jimallman in mind when changing things
17:30 kcranstn codiferous - other way around. Worried about making the code match this google doc
17:30 josephwb of course, old services will be there
17:30 josephwb for a while
17:30 josephwb kcranstn: easy peasy
17:30 codiferous ah, i don't think that will be much work. for that service, it's just removing a conditional
17:31 kcranstn josephwb - on the call, we decided that we would ask for input on the software list (i.e. hackathon participants) before making any changes
17:31 josephwb ok
17:31 josephwb i have stuff on another branch to play with
17:31 kcranstn so this doc is the proposal that we will send to the list
17:31 josephwb oh
17:31 josephwb good idea
17:31 josephwb i would like to get this done *well* before the hackathon
17:32 kcranstn deadline for feedback = next wed
17:32 josephwb we have a deadline for feedback?
17:32 josephwb TOO FAST!
17:32 codiferous not much time to work with
17:32 josephwb codiferous?
17:32 * jimallman is reading now, trying to understand how this can be one query for both...
17:32 josephwb for them, or us?
17:32 codiferous for all of us
17:33 josephwb jimallman: you query getMRCA with either "taxonomy" or "synth"
17:33 josephwb it will return both now
17:33 josephwb you call it twice, yes?
17:33 jimallman oh, nice!
17:33 kcranstn need some documentation about the tnrs methods at the top of the doc
17:33 jimallman josephwb: yes, the UI (on devtree) has two separate tests now
17:34 josephwb what I would like feedback on is the proposed "node_status" service. What would people want beyond what I have listed?
17:34 josephwb jimallman: right. now you just need one button!
17:35 jimallman i was hoping we’d sort out which was the more useful test. sounds like y’all have done that.
17:35 josephwb for curation, definitely the taxonomy flavoured one
17:35 kcranstn that discussion is secondary to the API docs
17:35 * jimallman is bowing out with his tangent in his hands
17:36 kcranstn codiferous - those first two could use examples (the context queries)
17:36 codiferous ok, working on it
17:36 kcranstn I changed the first one
17:37 codiferous the first one?
17:37 codiferous oh, i see
17:37 kcranstn I updated the descriptions - can you check?
17:37 kcranstn thanks
17:38 kcranstn cool, those are clearer now
17:38 josephwb should we organize the services by DB so we know what changes need to go where?
17:38 kcranstn nope
17:38 josephwb huh?
17:39 kcranstn (trying to keep them organized by concept so easier for users)
17:39 josephwb even though the services might plug into different DBS?
17:39 codiferous they are grouped by the plugin to which they belong
17:40 codiferous and the plugin belongs to a db
17:40 josephwb right
17:40 codiferous we (ot dev crew) just have to fill in that info in our heads, it's not likely to be of interest to clients/hackathoners
17:41 josephwb can we get "ot dev crew" shirts?
17:41 josephwb orange, please.
17:41 josephwb hardhats, too
17:41 codiferous haha, yes
17:41 kcranstn safety vests?
17:41 josephwb the safetyest
17:41 codiferous with the construction guy icon
17:41 josephwb word
17:42 josephwb i think this is confusing:
17:42 josephwb graph methods for graph of life (used to build the draft tree)
17:42 josephwb they can't build anything, right?
17:42 josephwb only query
17:43 codiferous "contains the draft tree?"
17:43 josephwb yes
17:43 josephwb good
17:43 josephwb or draft tree DB
17:43 josephwb (they can get stuff that is not in the draft tree)
17:44 codiferous hm. i think we should enforce the concept of a graph, it is important to understanding what we provide
17:45 codiferous it is a superset of the draft tree
17:45 josephwb sounds good
17:49 kcranstn ok, just study methods left to document
17:51 jar__ joined #opentreeoflife
17:52 kcranstn one point about this doc vs the v1 docs - we should indicate which methods on the google doc correspond to the v1 wiki page
17:53 kcranstn to pre-empt the question “where did method X go?"
17:53 codiferous yes
17:54 codiferous that info is the second half of the wiki page where you copied these names from, but we will need to do some remapping now that we've edited
17:54 codiferous here is a question: we have two methods to extract a subtree:
17:54 codiferous (1) complete subtree below a node
17:55 codiferous (2) pruned subtree containing only a set of identified tips and their relationships to one another
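For the first of the two cases listed above (the complete subtree below a node), a minimal sketch on a toy child map looks like this. The representation and the function name are assumptions made for illustration; the real services work against the neo4j graph.

```python
def complete_subtree(children, root):
    """Return the child map restricted to `root` and everything below it."""
    subtree = {}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        if kids:
            subtree[node] = list(kids)
            stack.extend(kids)
    return subtree

# Toy tree: node -> list of children (tips have no entry).
children = {"root": ["n1", "d"], "n1": ["n2", "c"], "n2": ["a", "b"]}

below_n1 = complete_subtree(children, "n1")
below_tip = complete_subtree(children, "a")
```

Nothing is filtered here: every descendant of the chosen node is returned, which is what separates this case from the tip-restricted one.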
17:55 josephwb yes
17:55 josephwb i don't see the latter in there
17:55 josephwb we need it tho
17:55 codiferous do we want those both accessible from the same service (methods indicated by supplied arguments), or do we want different services
17:55 kcranstn I think different services
17:55 josephwb i don't think so
17:56 kcranstn or, this could be a question that we pose on the doc
17:56 codiferous certainly simpler for now to have diff services
17:56 josephwb i think we used to have "getPrunedSubtreeForTips", or something equally cumbersome
17:56 codiferous we are likely to get a lot of feedback about subtree methods (see arlin's issues in the hackathon wiki), so we will probably want to rehaul later no matter what
17:57 codiferous how about "exclusive_subtree" for the pruned case
17:57 josephwb i think "subtree" is fine
17:57 codiferous we can rename the current "tree" (inclusive case) to "subtree"
17:57 josephwb i don't think of a complete tree below a node as a subtree
17:57 kcranstn I like “pruned” rather than “exclusive”
17:58 codiferous pruned is vague though, what are we pruning?
17:58 codiferous exclusive indicates it only contains the identified tips
17:58 josephwb but subtree connotes complete below some node (to me anyway)
17:58 codiferous a subtree is just a tree that is a subset of some other tree
17:58 josephwb think of pruning like in gardening: remove tips/branches
17:59 kcranstn that’s not what exclusive says to me. I expect a high-end subtree, with glittering gemstones
17:59 codiferous hahaha
17:59 codiferous we should definitely have that
18:01 kcranstn most R methods use ‘pruning’ to remove tips from trees
18:01 codiferous in different themes, like "madonna" and "gucci"
18:01 josephwb "bonsai_tree"?
18:01 kcranstn liberace
18:01 jimallman re: mapping of old to new names, i’d suggest adding the old name alongside the new. i tried this for tnrs/contexts…
18:01 codiferous right, if wanted to indicate which tips to prune, i think saying "pruned" makes sense
18:01 codiferous but we are indicating which tips to include
18:01 kcranstn yes, thanks jimallman
18:01 codiferous pruned_subtree sounds like a different method to me, where we say give me subtree X, but prune tips r,s,t,v
18:01 codiferous exclusive just means exclude everything except tips a,b,c,d,e
18:02 josephwb hmm, i don't like exclusive
18:02 josephwb i understand what you mean, i just don't like it
18:02 codiferous better word than exclusive?
18:02 josephwb i think of pruning as "pruning to", not "pruning of"
18:04 codiferous opposite of "inclusive"
18:08 codiferous it's a pruning method, i suppose. arlin has requested some others
18:08 codiferous currently we only have two, but it would be nice to think about how we might identify them once we have others
18:09 codiferous pruned_subtree for instance, could also apply to a randomly-pruned subtree, or a subtree pruned by some arbitrary algorithm
18:10 codiferous so maybe we call the method "pruned_subtree" and currently we just implement the "exclusive" pruning method, which requires a set of ids to identify the included nodes?
18:13 kcranstn In general, I am not sure if it is better to have one method where the structure of the arguments defines behaviour, vs different methods for different flavours of similar operations
18:13 kcranstn (I managed to get two Canadian spellings in there)
18:14 codiferous colourful use of language
18:14 codiferous we could easily have proliferation of subtree methods
18:14 kcranstn imposter!
18:14 codiferous :p
18:15 codiferous most apis i've seen try to keep access points simple, and provide lots of control over behavior within them
18:15 jimallman imposteur!
18:15 kcranstn in my neck of the woods, we would simply say poser
18:15 jimallman hoser
18:15 kcranstn that too
18:16 jimallman kcranstn: thoughts on the alternate format used for taxonomy/subtree?
18:17 kcranstn jimallman - I think we only have to provide the alternate name if it was on the wiki page
18:17 kcranstn i.e. for methods described https://github.com/OpenTreeOfLife/opentree/wiki/Open-Tree-of-Life-APIs
18:17 jimallman good call! i’ll check for that. and that makes the alternate format above more of an exception
18:17 kcranstn yup
18:18 codiferous i like the alternate format, keeps the association between old name/new name strong
18:18 codiferous and easy to see where there was no old name
18:19 kcranstn agreed
18:30 kcranstn codiferous - what if all of those studies/find* were studies/query/something
18:30 kcranstn see http://www.gbif.org/developer/occurrence
18:37 codiferous fine with me. just means some added complexity (not much) to the apache redirects
18:38 kcranstn hold off
18:38 kcranstn thinking
18:39 codiferous or maybe we just want "search"
18:39 codiferous like search/studies
18:39 kcranstn no
18:39 kcranstn definitely want the object first
18:40 kcranstn I am just thinking that search might be redundant, i.e. api.opentreeoflife.org/studies?author=Smith
18:40 kcranstn depends on implementation
18:41 codiferous right. well we're currently limited by the neo4j interface
18:41 codiferous so mirroring gbif's studies/search/{property}
18:41 codiferous would be a lot of work at the moment
18:41 kcranstn fair enough
18:41 kcranstn let’s not propose that, then
18:41 kcranstn trying to keep this round easy
18:44 codiferous jimallman, do you use oti's get all studies method?
18:44 * jimallman is checking now...
18:45 jimallman oti/v1/findAllStudies   …? yes, definitely
18:45 codiferous ok
18:45 codiferous what is the difference between that and phylesystem/study_list?
18:46 jimallman i had forgotten about that (or never knew), looking now…
18:49 jimallman phylesystem/v1/study_list returns a slightly simpler response, just an array of study IDs, vs an array of tiny objects from oti/v1/findAllStudies… it appears to pull these directly from the local phylesystem repo, through api_utils
18:50 codiferous do we want to document both?
18:50 jimallman study_list definitely seems like “internals”.. Mark might know if it has a distinct use case.
18:50 codiferous ok
18:51 * jimallman is grep’ing now to see who else calls study_list
18:55 codiferous kcranstn, what if we did study/search, tree/search, and node/search, and study/properties, tree/properties, and node/properties
18:55 jimallman it looks like gcmdr calls study_list in load_studies.py
18:55 kcranstn that works for me
18:55 codiferous ok
18:56 jimallman codiferous: peyotl also appears to use study_list when sync’ing with phylografter, etc.
18:58 codiferous ok, sounds like we can put it in the "dev only" category, so not documented
19:06 kcranstn codiferous - I misunderstood your question, and turns out I don’t agree
19:06 codiferous haha, ok
19:07 jimallman agreed (esp. as it’s potentially misleading, reading as it does from a “local” phylesystem repo)
19:07 kcranstn this Q: “what if we did study/search, tree/search, and node/search, and study/properties, tree/properties, and node/properties”
19:07 codiferous just trying to think of ways to avoid awkward urls like study/find_trees
19:11 codiferous seems like the restful url model doesn't really make sense applied to the oti queries
19:14 kcranstn I think the restful structure would be /study?node=property
19:14 kcranstn e.g. https://www.nescent.org/wg_evoinfo/PhyloWS/REST#PhyloWS_REST_Specification
19:16 kcranstn although note the use of find/object...
19:16 kcranstn hmmm
19:18 kcranstn we need input from people more familiar with API standards
19:19 jimallman kcranstn: i think your implicit query /study?color=blue   pattern is sensible. it would be more so if /study returned the list of all studies (summary response, or same as findAllStudies)
19:20 kcranstn codiferous - the description for /properties and /search don’t seem to match. The former states “within studies” and the latter indicates across studies
19:21 kcranstn i.e. it is unclear if study/tree/search will search across all studies, or only within a study
19:21 jimallman ah, and i see the PhyloWS uses the more explicit /blah/?query=….
19:22 kcranstn noting that phylows has never actually been implemented...
19:22 kcranstn but lots of discussion from people who think about API development
19:25 kcranstn what node and tree properties can you currently search against?
19:28 jimallman curl -X POST http://api.opentreeoflife.org/oti/ext/QueryServices/graphdb/getSearchablePropertiesForStudies
19:28 jimallman curl -X POST http://api.opentreeoflife.org/oti/ext/QueryServices/graphdb/getSearchablePropertiesForTrees
19:28 jimallman curl -X POST http://api.opentreeoflife.org/oti/ext/QueryServices/graphdb/getSearchablePropertiesForTreeNodes
19:29 kcranstn thanks
19:30 jimallman those were pasted from an old email… also available from more sensible URLs, for example:     curl -X POST http://api.opentreeoflife.org/oti/v1/getSearchablePropertiesForTreeNodes
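For anyone scripting against these endpoints rather than using curl, the call above can be approximated with the Python standard library. This is a sketch only: the URL is the one quoted in the log, the helper names are made up, the response is assumed to be UTF-8 text, and nothing is sent until `fetch()` is called.

```python
from urllib.request import Request, urlopen

# URL as quoted in the log; whether it is still reachable is not known here.
URL = "http://api.opentreeoflife.org/oti/v1/getSearchablePropertiesForTreeNodes"

def build_request(url=URL):
    """Build a bodiless POST, roughly what `curl -X POST <url>` sends."""
    return Request(url, method="POST")

def fetch(url=URL, timeout=10):
    """Send the request and return the response body as text (UTF-8 assumed)."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8")

# Building the request does not touch the network.
req = build_request()
```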
19:33 * jimallman just noticed that treemachine is down on devapi…
19:34 codiferous jim, re: the format /study?color=blue, we could probably do this now by remapping internally using apache to the oti queries
19:34 codiferous but i don't know how much work that is, or whether it's worth it at the moment
19:35 jimallman i know that towodo is hesitant to get too frisky with apache configuration, if we can find another way.. but this melding of oti and phylesystem-api is a case where we should consider it.
19:36 kcranstn he just got back from a meeting, so should have input soon
19:36 codiferous is that worth attempting before the hackathon?
19:36 towodo hi
19:36 codiferous aloha
19:36 towodo haven’t read through all of the above conv
19:36 codiferous trying to find a solution to the url setup for queries currently in oti
19:37 codiferous the proposed study/find_trees model is not ideal
19:38 jimallman more generally, we’re talking about a user-facing meld of the existing oti and phylesystem APIs, right?
19:39 jimallman different styles and (former) paths, but common focus on the study NexSON
19:39 towodo oti and phylesystem are logically the same, unfortunately
19:39 jimallman i think it’s a good idea, if they’re two sides of a coin
19:40 kcranstn jimallman - unpack that thought
19:40 towodo so what am I looking at - /study/tree/search ?
19:40 kcranstn should this be study/search?tree=property
19:40 kcranstn or study/search/tree/
19:40 kcranstn or study/tree?property=value
19:41 towodo the question should be, what is the resource?
19:41 towodo e.g. for /taxonomy/x, the resource is the taxonomy
19:41 towodo for /tnrs/x, the resource is the taxonomic name resolution system
19:41 towodo for /study/, I don’t know what the resource is
19:41 jimallman kcranstn: unpacking… we started on this path when we decided to go with git for data storage, with oti as its indexing partner. they’ve always been siamese twins.
19:42 kcranstn why does study = resource not make sense
19:42 kcranstn ?
19:42 towodo the resource is the phylesystem.  it’s not a study
19:42 towodo it’s a bunch of studies
19:42 jimallman unpacking (cont’d):  the fact that they use two (very) different technologies is an implementation detail, so we’re bringing them back together in the user-facing API
19:42 kcranstn ok, that makes sense
19:42 towodo I don’t mind calling it /phylesystem/
19:43 kcranstn me neither
19:43 kcranstn so, then what do the URLs look like for searching studies, trees, nodes, etc
19:44 towodo three-level names bug me… unless the phylesystem can be said to have parts, which I don’t think is the case
19:44 jimallman and should we attempt RESTful style for all services starting /study ? if so, a common pattern is for GET /study to return a list of  all studies, and a query-string would be interpreted as filtering that list
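The pattern jimallman describes (a bare /study returns every study, and a query string filters that list) can be sketched with the standard library. The study records and field names below are invented for illustration and do not reflect the phylesystem or oti data model.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical study summaries; the field names are illustrative only.
STUDIES = [
    {"id": "study_1", "author": "Smith", "year": "2012"},
    {"id": "study_2", "author": "Jones", "year": "2012"},
    {"id": "study_3", "author": "Smith", "year": "2014"},
]

def list_studies(url, studies=STUDIES):
    """Return all studies, narrowed by any key=value pairs in the query string."""
    filters = parse_qsl(urlsplit(url).query)
    return [s for s in studies
            if all(s.get(key) == value for key, value in filters)]

all_ids = [s["id"] for s in list_studies("/study")]
smith_ids = [s["id"] for s in list_studies("/study?author=Smith")]
smith_2014 = [s["id"] for s in list_studies("/study?author=Smith&year=2014")]
```

With no query string the filter list is empty, so every study passes, which gives the "list of all studies" behaviour for free.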
19:45 towodo hmm.
19:45 codiferous "it’s a bunch of studies " <- yes
19:45 jimallman towodo: we already have subresource URLs in the phylesystem API, see /study/{ID}/tree/{ID}
19:45 towodo yes, but those are legitimate whole / part relationships
19:46 jimallman to me, these could be interpreted as scoped searches: /study/{ID}/tree?id=tree456
19:46 jimallman (ok, that one’s lame, but some other kind of tree query within a study)
19:46 codiferous but how do you then search for trees across all studies?
19:47 codiferous that query looks like it should only apply within a study
19:47 jimallman that was my intent, yes
19:47 codiferous i see
19:47 jimallman so more generally, maybe something like    /tree?has_branch_lengths=true
19:48 jimallman so far, we haven’t elevated any other resource type to “first-class” status… only ‘study’
19:49 jimallman (in URLs, i mean)
19:49 towodo I don’t have my head around this collection of methods
19:49 jimallman it’s a tricky mix of styles, imo (RESTful and RPC)
19:49 codiferous we could simply handle it all through one entry point: phylesystem/search
19:50 codiferous parse queries internally and map them to the necessary methods
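A rough sketch of the single-entry-point idea: one search function inspects the request and hands it to whichever underlying method applies. The handler names and the `resource` argument are hypothetical, and the handlers only report what they would do instead of querying oti.

```python
def find_studies(prop, value):
    """Placeholder for a study search; a real handler would query the index."""
    return ("find_studies", prop, value)

def find_trees(prop, value):
    """Placeholder for a tree search across all studies."""
    return ("find_trees", prop, value)

# Map each searchable resource to the method that serves it.
HANDLERS = {"study": find_studies, "tree": find_trees}

def search(resource, prop, value):
    """Single entry point: route the query to the matching handler."""
    try:
        handler = HANDLERS[resource]
    except KeyError:
        raise ValueError("unsupported resource: %r" % resource)
    return handler(prop, value)

study_call = search("study", "author", "Smith")
tree_call = search("tree", "has_branch_lengths", "true")
```

The trade-off raised in the log still applies: the URL space stays small, but the routing logic has to be written and maintained on the server side.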
19:50 jimallman true, and that removes some of the voodoo with sub-resources
19:50 codiferous nontrivial amount of work before the hackathon
19:51 towodo I’m not necessarily going for total REST, I just think that / should reflect whole/part relationships
19:51 jimallman agreed
19:51 kcranstn we don’t have to have everything done before the hackathon - some of the searching across studies can be easily done from bash with grep
19:52 kcranstn We can have a section of proposals that we don’t promise before the event
19:52 towodo does tree/search search all trees in all studies?
19:52 kcranstn yes
19:53 towodo seems an odd sort of thing, what’s a use case? do we need to document it at all?
19:53 kcranstn we could strip those out for now
19:54 towodo node/search can be used to find trees containing a taxon, does it have any other uses?
19:54 towodo I think I would just change / to _    tree_search, node_search    or find_trees, find_nodes
19:56 codiferous it finds studies with matching trees
19:56 codiferous similarly, node search finds studies with trees with matching nodes
19:56 towodo I know that’s what it does. my question was what the use case was
19:56 towodo what’s it good for ? I can’t imagine
19:57 codiferous node search is probably not very useful, tree search is used for things like taxon searches
19:57 codiferous not all trees in a study contain all taxa
19:58 codiferous or type of support values (though we don't have any, minor detail)
19:58 towodo the summary says it looks for trees with properties, not trees containing taxa
19:58 codiferous anything specific to trees
19:58 codiferous any kind of property, taxon names and ott ids are available properties
19:58 towodo where I would assume property = nexml property
19:59 codiferous it's a subset of the ot:* property set
20:00 codiferous supported properties differ for trees vs. studies vs. nodes
20:00 codiferous curl -X POST http://api.opentreeoflife.org/oti/ext/QueryServices/graphdb/getSearchablePropertiesForTrees
20:01 towodo I still can’t imagine how something searching for properties of trees, could possibly find trees containing taxa. is whether tree T contains taxon X a property of tree T, for every X?
20:01 codiferous no, the use of "property" here is vague
20:02 codiferous in the case of taxa, it's whether a node in T is mapped to X, or to a descendant taxon Y of X
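The matching rule codiferous describes (a tree matches taxon X if some node is mapped to X or to a descendant of X) can be sketched as below. The taxonomy is a toy child-to-parent map and the function names are hypothetical; the real index stores names and ott ids as properties on tree root nodes.

```python
def is_taxon_or_descendant(tax_parent, taxon, target):
    """True if `taxon` is `target` or lies below it in the taxonomy."""
    while taxon is not None:
        if taxon == target:
            return True
        taxon = tax_parent.get(taxon)
    return False

def tree_contains_taxon(mapped_taxa, tax_parent, target):
    """True if any node in the tree maps to `target` or one of its descendants."""
    return any(is_taxon_or_descendant(tax_parent, t, target)
               for t in mapped_taxa)

# Toy taxonomy: child -> parent.
tax_parent = {"Homo sapiens": "Homo", "Homo": "Hominidae", "Pan": "Hominidae"}
# Taxa that the nodes of one hypothetical tree are mapped to.
mapped_taxa = ["Homo sapiens", "Mus"]

has_hominidae = tree_contains_taxon(mapped_taxa, tax_parent, "Hominidae")
has_pan = tree_contains_taxon(mapped_taxa, tax_parent, "Pan")
```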
20:03 codiferous internally, the taxon names and ott ids are stored as properties of the tree root nodes, but that is an implementation detail
20:04 towodo I’m just saying that since the description talks about properties, and property is a pretty particular idea in both programming and knowledge representation, it would be better to redescribe.  maybe ‘characteristic’ and give finding taxa as an example.  not saying we need lots of detail, but it has to be not misleading.
20:04 codiferous i have no problem with that
20:05 codiferous though "characteristic" seems like an odd word to use
20:05 codiferous i'm not attached to "property" though
20:05 codiferous but the pressing question i think is about the url(s) for searching studies/trees/nodes
20:07 towodo we get rid of node search
20:07 codiferous good
20:08 codiferous might need to provide some mechanism for this in the future, but we certainly have no use cases right now
20:08 towodo I mean, we can keep it in the code, but not document it, if maybe there is a use case we haven’t thought of
20:08 towodo I don’t know whether ‘find’ or ‘search’ is better.  ‘find_trees’ seems good to me.  that would imply ‘find_studies’
20:09 codiferous e.g. "v2/phylesystem/find_studies"?
20:09 towodo or even just ‘trees’ and ‘studies’
20:09 towodo now I want to look at other APIs to see what they do
20:11 codiferous and what do we do about providing a list of properties/characteristics that can be used for search?
20:12 towodo what’s there seems ok
20:16 towodo well maybe tree_search_properties
20:16 towodo I’m not sure why it’s dynamic rather than part of the documentation
20:17 codiferous there are different properties for trees and studies
20:17 towodo yes… that could be documented
20:17 codiferous doesn't need to be dynamic, was just convenient self-documentation
20:17 codiferous it shouldn't change much now
20:18 towodo I don’t see any other way to use these methods, than as getting documentation preparatory to writing code.
20:18 towodo in which case they probably shouldn’t be in the API doc as methods. (their results should be in the doc)
20:19 codiferous yes
20:26 codiferous kcranstn, is the "dev only" list still available somewhere? it has some notes/comments attached that we should not lose
20:26 codiferous at least, until the changes are made and reflected on docs
20:27 kcranstn the note about adding the parameters to the search method? we need to do that for all methods, so I didn’t think we needed a specific note for that one
20:27 kcranstn you can always get history back, too
20:28 kcranstn Just trying to clean up the doc before we send it out
20:28 codiferous they aren't parameters, rather a list of possible values for the parameters. i can just try to remember. joseph also had some comments about some other thing. can we just put that aside in a doc in the software folder for now? i can delete it later when it becomes obsolete
20:28 codiferous agree it doesn't belong in this doc
20:29 kcranstn pasted it back. You can move it elsewhere
20:30 towodo I’m not too keen on taxonomy/deprecated_taxa. If someone wants this list they should get it from the taxonomy dump
20:30 kcranstn done
20:30 kcranstn what about flags?
20:32 towodo not sure what the use case for that is. Cody?
20:32 codiferous i added it because you asked me to
20:32 codiferous i don't know of any client use cases, so probably take it out of the public api
20:35 towodo we have two cases of the pattern A/X + A/B_X.  I think I might prefer A/C_X + A/B_X… e.g.
20:35 codiferous ?
20:35 towodo oops it changed.  /taxonomy/taxonomy_info and /taxonomy/taxon_info
20:35 codiferous example?
20:36 codiferous oh
20:37 towodo removing _info from taxon_info is odd.  the method doesn’t return the taxon, it returns information about the taxon.
20:37 codiferous graph/node_status is the same form, just switch to graph/node ?
20:37 codiferous ok, it can go back
20:37 codiferous seemed simpler, but i don't care
20:37 codiferous then node_status should be node_info?
20:38 towodo I would think so
20:38 towodo I could see going the other way - it’s very common in programming to confuse the thing with the representation of information about the thing
20:39 towodo e.g. we do that with ‘study’
20:39 towodo and ‘tree’
20:39 codiferous yeah
20:40 towodo but somehow to me saying the method returns the study or tree seems natural, but saying the method returns the taxon or node doesn’t.
20:40 codiferous the use of info adds a tiny amount of tedium, but is clearer
20:40 codiferous so that is fine with me
20:41 kcranstn I need to head out and pick up cat food before the store closes
20:42 kcranstn have a good weekend, everyone!
20:42 kcranstn will send this out later today
20:42 towodo ok
20:42 kcranstn but will check irc logs first
20:42 kcranstn in case of OMG please not yet!
20:42 codiferous you too, thanks karen!
20:48 codiferous so, what do you think about the phylesystem search urls?
20:48 codiferous for studies and trees
20:48 towodo you changed them I think
20:49 codiferous they were phylesystem/find_studies, etc. i think
20:49 towodo they’re ok.  if someone hates them that will come out in review
20:49 codiferous ok
20:50 codiferous the other weird item is the tree/subtree vs tree/pruned_subtree
20:50 codiferous there was quite a discussion about this. no great solution apparent
20:51 towodo personally I think I’d have a different method for each kind of pruning. merging into one method has no value IMO
20:51 codiferous ok
20:52 codiferous currently we have one pruning method (described in the doc)
20:52 towodo I always say ‘induced subtree’ but nobody else has picked up on that, so I don’t push it
20:52 codiferous what do we call that method? i advocated against "pruned_subtree" because many other types of subtree pruning methods have been requested
20:53 towodo well, ‘induced’ was my suggestion
20:53 codiferous is that method technically an induced subtree?
20:53 towodo it’s the subtree induced by a set of tips - is ‘induced’ used technically in a different way in the literature?
20:54 codiferous i've seen it but i don't know the definition
20:57 codiferous "a subset of the vertices of a graph G together with any edges whose endpoints are both in this subset."
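
The graph-theory definition quoted above can be checked against a small tree. The sketch below is illustrative only: the function name `induced_subgraph`, the edge-list representation, and the toy node names are assumptions, not part of any Open Tree API. It suggests why the textbook sense of "induced" does not by itself give a tree over a set of tips: no tree edge has both endpoints among the tips.

```python
def induced_subgraph(edges, vertices):
    """Keep only the edges whose endpoints are both in `vertices`.

    `edges` is an iterable of (u, v) pairs; names are hypothetical.
    """
    keep = set(vertices)
    return {(u, v) for u, v in edges if u in keep and v in keep}


# Toy tree (hypothetical labels): root -> A, B; A -> a1, a2; B -> b1
tree_edges = [("root", "A"), ("root", "B"), ("A", "a1"), ("A", "a2"), ("B", "b1")]

tips_only = induced_subgraph(tree_edges, ["a1", "a2", "b1"])
with_ancestor = induced_subgraph(tree_edges, ["A", "a1", "a2"])
```

With tips alone the edge set is empty; the internal nodes have to be included explicitly before any edges survive.
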
20:58 towodo well my meaning is just made up. other words: ‘implied’, ‘spanning’, ‘over’
20:59 towodo spanning_tree
21:00 josephwb i think "pruned" is pretty standard
21:00 codiferous spanning is closer, but we excise vertices of degree 2
21:00 josephwb for phylo-dorks
21:00 towodo I was just saying that…
21:00 towodo if the masses like “pruned”, give them “pruned”
21:00 codiferous we already have requests for different types of pruned trees
21:00 josephwb link?
21:01 towodo I see, ‘pruned’ isn’t consistent enough?
21:01 josephwb "subsampled" would also work
21:01 towodo oops not consistent, specific
21:01 josephwb i think it is
21:01 codiferous https://github.com/OpenTreeOfLife/hackathon/issues/15
21:02 towodo you think ‘pruned’ is specific enough? and we should use a different word for the other kinds of subtree extraction?
21:02 towodo arlin doesn’t use the word ‘pruned’
21:02 josephwb *reading link*
21:02 codiferous i do not think pruned is specific enough
21:02 codiferous arlin doesn't use it, but he's requesting pruned subtrees
21:03 codiferous pruning is an operation that can be done in a variety of ways
21:03 josephwb he is talking about sampling, not pruning
21:03 josephwb pruning = deterministic
21:03 josephwb you want X, Y, Z
21:04 codiferous spanning_subtree is much more accurate in this case than pruned_subtree
21:04 towodo arlin is talking about choosing a set of tips. having chosen the tips one then might want a tree over those tips.
21:04 josephwb you know exactly the leaves in the returned tree (indeed, you ask for them!)
21:05 josephwb sorry, have no idea what "spanning_subtree" is
21:05 codiferous jonathan proposed it
21:05 josephwb *looking in log*
21:05 towodo ‘subtree’ is borderline problematic, it means something different in some computer science papers
21:05 codiferous it is a term from graph theory. i've seen it in the literature
21:05 codiferous http://en.wikipedia.org/wiki/Spanning_tree
21:06 towodo ‘spanning tree’ is a term any computer science undergraduate knows. it is an algorithms 101 thing
21:06 codiferous in phylogenetics, it means a subset of a tree
21:06 codiferous subtree, that is, not spanning tree
21:07 josephwb are we catering to phylogeneticists or graph people?
21:07 codiferous or it can indicate this weirdly undefined "induced subtree" that we're talking about
21:07 josephwb i understand, but seems like jargon
21:07 codiferous hm, i'm not sure what the distinction is between jargon and common usage here
21:08 towodo we are catering to intelligent people who use words carefully and are willing to learn new things if necessary
21:08 codiferous i do not think biologists will be confused by the use of "subtree"
21:08 josephwb no, that is standard
21:08 towodo fine.
21:09 codiferous is spanning_subtree acceptable?
21:09 towodo ‘pruned subtree’ has very few google hits
21:10 towodo I’m having second thoughts about ‘spanning’, which is mainly used in the sense of a spanning tree of a graph
21:11 josephwb forgive my lack of computer 101, but "spanning" seems to connote "all of the taxa I asked for, plus the intervening one"
21:11 josephwb ones
21:11 josephwb but in this case, you *only* want to requested taxa
21:12 towodo in CS it means “the smallest one necessary to cover those things and only those things”
21:12 josephwb but in this case, you *only* want the requested taxa
21:12 josephwb ok
21:13 codiferous i am fairly sure i have seen it used in phylogenetics literature, likely from steel and associates
21:13 codiferous i think it is very close to what we are doing
21:13 josephwb sure. but, come on. why not call them X-trees then.
21:13 codiferous we are presenting the spanning tree with the vertices of degree 2 excised
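
The operation described here (the tree connecting a selection, with vertices of degree 2 excised) can be sketched as follows. This is a minimal illustration under stated assumptions, not the treemachine implementation: the tree is assumed to be a plain child-to-parent dictionary, and `induced_subtree` and all node labels are hypothetical names.

```python
def induced_subtree(parent, selection):
    """Return (root, child -> parent map) for the tree over `selection`.

    `parent` maps each non-root node to its parent (assumed representation).
    Unselected nodes left with a single child are excised, so every retained
    node is either selected or a branching point.
    """
    selected = set(selection)
    if not selected:
        raise ValueError("selection must not be empty")

    def path_to_root(node):
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    paths = [path_to_root(n) for n in selected]
    if len({p[-1] for p in paths}) != 1:
        raise ValueError("selected nodes are not all in the same tree")

    # Every node lying on a path from a selected node up to the root.
    on_path = set()
    for p in paths:
        on_path.update(p)

    # Children of each node, restricted to nodes on those paths.
    children = {}
    for n in on_path:
        if n in parent:
            children.setdefault(parent[n], []).append(n)

    # Descend from the original root to the most recent common ancestor.
    root = paths[0][-1]
    while root not in selected and len(children.get(root, [])) == 1:
        root = children[root][0]

    def retained(n):
        return n in selected or len(children.get(n, [])) >= 2

    # Link each retained node to its nearest retained ancestor.
    result = {}
    stack = [(c, root) for c in children.get(root, [])]
    while stack:
        node, ancestor = stack.pop()
        if retained(node):
            result[node] = ancestor
            ancestor = node
        stack.extend((c, ancestor) for c in children.get(node, []))
    return root, result


# Toy tree (hypothetical labels): root -> A, B; A -> a1, X; X -> a2, a3; B -> b1, b2
toy_parent = {"A": "root", "B": "root", "a1": "A", "X": "A",
              "a2": "X", "a3": "X", "b1": "B", "b2": "B"}
```

Calling `induced_subtree(toy_parent, ["a2", "b1"])` attaches both tips directly to `root`, because `A`, `X`, and `B` are each left with a single child. The selection is not required to consist of tips: a selected internal node is retained even if it has only one child.
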
21:13 josephwb his papers are not standard
21:13 josephwb not widely read by potential users
21:14 josephwb "sampled_tree" seems unambiguous for what Arlin wants
21:14 towodo but so far we haven’t found a single standard term - and barely any term at all - for this concept.  any terminology we use will be nonstandard.
21:14 josephwb "pruned_tree" is, well, a pruned tree
21:15 josephwb "pruned_to", not "pruned_of"
21:15 codiferous it is not as precise
21:15 josephwb maybe that is confusing?
21:15 josephwb but i don't think so.
21:16 towodo I think we should find a single published source that uses some particular terminology, and choose that.
21:16 codiferous yes, i think it is confusing, i would imagine that to request a "pruned tree" you would specify the things to prune
21:16 josephwb brian omeara might have a reference
21:16 josephwb what?
21:16 towodo http://hongqinlab.blogspot.com/2012/12/phylogeny-using-biopython.html
21:17 codiferous i agree, if there is a definition for this we should use it
21:17 towodo https://phenoscape.org/wiki/Needs_Analysis_Workshop/Report-out_Day2
21:18 towodo I know these are not publications
21:18 towodo http://www.pnas.org/content/106/44/18621.full
21:19 codiferous a pruned tree is a tree with some clades/tips removed
21:19 towodo http://rpubs.com/bw4sz0511/prune
21:19 josephwb http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3187300/
21:20 josephwb With a set of pairwise distances that describe the degree of dissimilarity among individuals, an MST represents a set of edges (connections) that link together nodes (individuals) by the shortest possible distance.
21:20 josephwb MST = minimum spanning tree
21:20 josephwb contains all taxa
21:20 towodo yes, that
21:20 towodo yes, that’s why i had second thoughts about ‘spanning’
21:20 towodo we could say ‘tree_from_selection’
21:21 towodo or ‘subtree_from_selection’
21:21 towodo and sidestep the whole <adjective> pruned problem
21:21 josephwb but that is what we had already
21:21 towodo no, we had ‘pruned’
21:21 josephwb getSubTreeForNodes, or something similar
21:22 josephwb i mean existing stuff
21:22 codiferous either of those would be fine with me
21:22 towodo or ‘subtree_with_tips’
21:22 towodo I don’t know, you’re the phylogeny guys
21:22 josephwb ha!
21:24 towodo subtree_from_tips    [not keen on ‘nodes’]
21:24 josephwb but it doesn't have to be tips
21:24 towodo they will end up all being tips in the result, yes?
21:24 josephwb not as it exists, anyway
21:24 towodo oh, I guess some might be internal…
21:24 josephwb i don't know
21:25 josephwb what do people want
21:25 josephwb yes
21:25 towodo well, the word ‘selection’ is designed to bypass the node/taxon/tip question
21:25 josephwb we can constrain to tips, if we want
21:25 codiferous how about embedded_subtree
21:25 towodo no need
21:26 josephwb "embedded_subtree"? really?
21:26 towodo grumble.  what don’t you like about ‘subtree_from_selection’ ?
21:26 codiferous that's ok with me
21:26 codiferous didn't think we had settled
21:26 towodo ok, ‘subtree_from_selection’ wins
21:27 towodo can I delete ‘may implement any variety’ and following?
21:27 towodo gfi
21:28 codiferous already done
21:30 codiferous so, dendroscope calls it an "induced subnetwork"
21:30 towodo ha!
21:31 towodo looking at ‘the entire graph database’ - that’s going to be confusing to some people, how is that different from the phylesystem?
21:32 codiferous it doesn't contain all the studies from phylesystem
21:32 codiferous just the ones going into synthesis
21:32 towodo I know that. I’m saying the prose needs crafting. I’ll give it a try
21:32 codiferous ok
21:34 codiferous shall we be consistent with dendroscope and use induced_subtree instead of subtree_from_selection?
21:34 josephwb meh, i like "targetted_subtree" over "subtree_from_selection"
21:35 towodo I’m ok with that (have been advocating for that term for months)
21:35 towodo the subtree induced by a set of nodes.  makes perfect sense to me
21:35 codiferous i'm ok with it too
21:36 towodo (‘that term’ = ‘induced’)
21:36 codiferous yes
21:37 towodo what do you think?
21:37 codiferous he's feeling pensive today
21:41 towodo I think I’m done for now
21:42 codiferous yeah. i think the doc is done, yes?
21:42 towodo hyperlinks to the v1 documentation would be awfully nice… I’ll see if I can get to it
21:42 codiferous i can do that
21:42 codiferous anything else?
21:43 towodo that would be great… no I think it’s ok for review, we have more time to work out kinks
21:43 codiferous ok
21:43 towodo karen will be sending email, I’ve already talked to her about that
21:44 codiferous yes
21:44 codiferous i am just messing with the doc. no intention of doing anything email related here
21:58 kcranstn joined #opentreeoflife
