
IRC log for #opentreeoflife, 2015-02-06


All times shown according to UTC.

Time Nick Message
00:02 jar286 joined #opentreeoflife
01:07 jar286 joined #opentreeoflife
01:38 jar286 joined #opentreeoflife
02:39 jar286 joined #opentreeoflife
11:50 jimallman joined #opentreeoflife
12:31 jar286 joined #opentreeoflife
13:21 josephwb1 joined #opentreeoflife
13:24 josephwb1 you there jar286?
13:24 jar286 more or less
13:24 josephwb1 looking at yer issue: https://github.com/OpenTreeOfLife/treemachine/issues/162
13:24 josephwb1 i can add a verbosity arg
13:24 josephwb1 is there stuff you definitely *want* output?
13:25 josephwb1 or do you even care about this stuff?
13:25 josephwb1 it is mostly us looking at the logs
13:25 jar286 I haven’t yet made use of it. It’s more for your use, if someone should complain about the production server.
13:25 josephwb1 understood.
13:26 josephwb1 we don't often even look at it ourselves. was there for troubleshooting problems
13:26 jar286 The latest case was when it was just gobbling up all cpu time, and I didn’t know how to diagnose the problem
13:26 josephwb1 i will turn down the default logging to a minimum.
13:26 josephwb1 what was it doing?
13:26 jar286 The thing I miss most are the POST parameters. these don’t seem to show up anywhere
13:27 josephwb1 is someone querying it?
13:27 josephwb1 a lot?
13:27 jar286 I have no idea what it was doing. Just spinning and disabling the application. I stopped it
13:27 josephwb1 weird.
13:27 jar286 You can look at that log file I pointed you to
13:27 jar286 the end of it might say what it was doing at the time, I don’t know
13:27 josephwb1 yes, i looked.
13:28 jar286 if the logs just gave the POST parameters that might be enough , I don’t know
13:28 jar286 then at least we could try reproducing problems.
13:29 josephwb1 hmm. i recognize a test case in there.
13:29 josephwb1 the line:
13:29 josephwb1 "(((Stellula_ott501678,(Dendroica_ott666104,Cinclus_ott267845)),(Clangula_ott316878,Perdix_ott102710))Neognathae_ott241846,Struthio_ott292466)Aves_ott81461"""
13:29 josephwb1 is the example case for induced_tree
13:29 jar286 people are going through the API docs
13:30 josephwb1 yes, looks like it.
13:30 josephwb1 we don't need to log that stuff
13:31 mtholder joined #opentreeoflife
13:31 josephwb1 hey mtholder
13:31 jar286 but I would think we do need to log POST parameters, yes?
13:31 josephwb1 which parameters are those?
13:32 jar286 umm… the values of the arguments to the plugin methods
13:32 jar286 think about what you would do to debug a very slow request, or one that hangs.  what information would you want?
13:32 josephwb1 oh, i see.
13:33 pmidford2 joined #opentreeoflife
13:33 josephwb1 i will look at it when i get in to the office.
13:34 josephwb1 mtholder: i've got an optimized reprocess-er that works for newicks.
13:34 josephwb1 on branch reprocess
13:34 josephwb1 need to tweak things for nexsons. shouldn't take too long.
13:34 jar286 I think I would put: timestamp, arguments, nonce on call; then timestamp, nonce on return  - the nonce being anything that could be used to match call and return, maybe a short unique id (counter) or even the first timestamp
13:34 jar286 then if there’s a call without a return, we’ve nailed it
13:35 josephwb1 then i am going to reprocess the entire DB from the grand synth, see 1) how many iterations, 2) how long it takes.
13:35 jar286 this could be done for select methods
13:35 josephwb1 potentially a dramatic improvement.
13:35 josephwb1 potentially, not.
13:36 josephwb1 jar286: yes, that sounds like a good solution.
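
The call/return logging that jar286 proposes above could look roughly like the Python sketch below. It is only an illustration under stated assumptions: treemachine's plugin methods are not written in Python, and the names `logged`, `unreturned`, and the in-memory `log` list are hypothetical stand-ins for whatever the real logger would use. The nonce here is a simple per-process counter, one of the options mentioned in the discussion.

```python
import itertools
import json
import time

# Hypothetical nonce source: a per-process counter used to pair call and return.
_counter = itertools.count(1)


def logged(log):
    """Wrap a function so that each call and each return is appended to `log`."""
    def decorate(fn):
        def wrapper(*args, **kwargs):
            nonce = next(_counter)
            # On call: timestamp, arguments, nonce.
            log.append({
                "event": "call",
                "nonce": nonce,
                "time": time.time(),
                "method": fn.__name__,
                "args": json.dumps([args, kwargs], default=str),
            })
            result = fn(*args, **kwargs)
            # On return: timestamp, nonce. Skipped if fn hangs or raises.
            log.append({"event": "return", "nonce": nonce, "time": time.time()})
            return result
        return wrapper
    return decorate


def unreturned(log):
    """Return the 'call' records that have no matching 'return' record."""
    returned = {r["nonce"] for r in log if r["event"] == "return"}
    return [r for r in log if r["event"] == "call" and r["nonce"] not in returned]
```

A call that hangs or fails leaves a "call" record with no matching "return", so `unreturned(log)` lists exactly the requests worth trying to reproduce, together with their arguments.
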
13:38 mtholder josephwb. thanks.
13:45 josephwb1 joined #opentreeoflife
13:56 jar286 joined #opentreeoflife
14:01 pmidford2 jar286 - ready, I'll wait for your hangout request
14:22 jimallman joined #opentreeoflife
14:56 codiferous joined #opentreeoflife
15:35 jimallman joined #opentreeoflife
16:38 jar286 jimallman, I’m putting together agenda for Tuesday, are there ‘discussion needed’ items that should go on?
16:38 jimallman yes, we should definitely try to nail down the tree properties…
16:39 jimallman that is, label (or labels) and “tree type” and inference method… others?
16:39 jar286 I thought we weren’t going to know these until we got curator feedback?  Or are you talking a first cut.
16:39 jimallman ah, didn’t realize this was pending feedback. i don’t think there’s much point in moving forward without decisions.
16:39 jimallman the UI changes should be trivial, and won’t work without changes to Nexson
16:40 jar286 well let’s raise it in any case.  let’s put out a proposal for first cut
16:40 jimallman OK. other items:
16:40 jimallman https://github.com/OpenTreeOfLife/opentree/issues/510
16:41 jimallman (nicer links to trees in references list? requires oti work, i think)
16:41 jimallman i have a couple of other oti requests outstanding, but they’re not urgent
16:42 jar286 no don’t worry about nicer links
16:42 jimallman ok with link to the tree viewer (in curation app), versus nexson?
16:42 jimallman (nexson download, i mean)
16:43 jar286 hmm, not sure
16:43 jar286 we need versions both for human consumption and machine consumption.  human doesn’t matter so much so long as there are bread crumbs
16:43 jimallman at this point, someone trying to reproduce synthesis would need to connect a lot of dots. including finding the actual versions used in synthesis. all the information is there, but it’ll be much easier when we have a proper collection they can just grab.
16:44 jimallman crumbs we’ve got :)
16:44 jar286 hmm, right
16:45 jar286 so does that make #510 held pending collections?
16:46 jar286 collections then need versioned trees, not just trees
16:46 jimallman hmmmmm
16:47 jimallman i can see arguments both ways. maybe we need the ability to “freeze” a collection? pin all versions and lock from further changes?
16:47 jar286 that could happen by default, if there were notification of update and an easy way to advance
16:47 jimallman sort of a “snapshot”. the collection lives on, but this version is accessible (probably using a git SHA) for reproducibility etc.
16:47 jar286 that works too
16:48 jimallman git tag/release might be a nicer way to show this. should be do-able through the APIs
16:49 jimallman i suppose it’s more of a deliberate action (taking a snapshot) since it means capturing the current SHAs for all elements in the collection.
16:50 jimallman no, i think that’s backwards…
16:50 jar286 remember we need to do this retrospectively, too…
16:50 jimallman once treemachine is using collections as input to synthesis, it should just record the SHA of the collection and all studies used
16:51 jar286 snapshot doesn’t have to be through UI, maybe
16:51 jar286 especially since the build is scripted
16:51 jimallman right, i’m thinking it’s just a record of git versions captured during synthesis (i believe we already have this, study SHAs at least. all but the existence of a collection)
16:54 jimallman so if synthesis recorded the collection used (and its current SHA), we could easily offer this collection from the synthesis-release page, which we’ll certainly want to do.
16:55 jimallman (with a caveat if the collection has changed in the meantime)
16:56 jar286 not sure the caveat or detection is needed if we’re saying the snapshot feature is for synthesis(es) only. the labeling of the collection should say enough.
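
The snapshot idea discussed above, recording the collection's SHA and the SHA of every study used during synthesis, can be sketched in Python as below. This is a hedged illustration, not the project's actual data model: the function names, the record layout, and the example identifiers are all assumptions. The `has_changed` check corresponds to the optional caveat jimallman mentions.

```python
def snapshot(collection_id, collection_sha, study_shas):
    """Build a record of the git versions captured during one synthesis run.

    `study_shas` maps a study id to the git SHA of the version that was used.
    """
    return {
        "collection": collection_id,
        "collection_sha": collection_sha,
        # Sort by study id so two snapshots of the same inputs compare equal.
        "studies": dict(sorted(study_shas.items())),
    }


def has_changed(record, current_collection_sha):
    """True if the collection has moved on since the snapshot was taken."""
    return record["collection_sha"] != current_collection_sha
```

Because the record pins every SHA, it could be produced by the scripted build rather than through the UI, and it could also be written retrospectively for past synthesis runs whose study SHAs are already known.
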
16:57 jar286 anyway, I have #555, #552, #510, #583 on the agenda. what about 554?
16:57 jimallman that’s done (narrowly defined)
16:57 jar286 close?
16:58 jimallman yes, i should have closed this
16:58 jimallman done
17:00 jimallman jar286: so i’m tinkering with caching options for the tree viewer (and possibly more). so far, redis looks like the path of least resistance. these requests don’t go through web2py, but straight to treemachine. since we already install redis on the API server, i’m looking into how to configure apache to route some treemachine calls through redis for caching.
17:01 jar286 ok. I think I’m going to see if mark or stephen wants meeting time to talk about synthesis.  then any remainder goes to front end, except maybe redundancy in there somewhere
17:01 jimallman yes, there’s likely some overlap. i have some open issues in oti, if  we have extra time to fill. :)
17:01 jar286 are there concurrency issues?
17:02 jar286 no one is going to listen to treemachine/oti requests right now.
17:02 jimallman gotcha
17:02 jar286 if you need something done, you may have to do it yourself
17:02 jar286 (we)
17:02 jimallman re: concurrency, the redis config seems to support “n databases”. will investigate this, though.
17:03 jimallman (the implication is that they’re assigned per-connection)
17:07 jar286 jimallman, burroughs is overdue by 1 day. i will push it out a week
17:08 * jimallman nods
17:09 jimallman just realized i could close #524 (done)
17:10 mtholder joined #opentreeoflife
17:15 jar286 hi mtholder, I was thinking about Tuesday’s agenda
17:16 mtholder yes?...
17:16 mtholder are we having a call? Karen will be in Europe.
17:16 mtholder I'm game.
17:16 jar286 should we have time to talk about synthesis? what would be useful from your point of view?
17:16 jar286 we can still have a call if it would be useful.  jim wants some discussion (me too) on some issues
17:16 mtholder that seems like the pressing issue from my PoV.
17:17 jar286 I’ll make it first item then… you can use the time or not
17:17 mtholder OK.
17:22 jimallman mtholder: i’m looking at using redis as the cache for tree-viewer queries. (i see that we already install this on the API server, but it doesn’t seem to be configured for in-memory cache.) Is there any reason i should back away slowly and find another solution?
17:24 mtholder sorry, for caching web2py responses?
17:25 jimallman no, these are direct calls to treemachine, like http://devapi.opentreeoflife.org/treemachine/v1/getSyntheticTree
17:25 jimallman so i thought i could configure apache to intercept these calls and pass through redis
17:25 mtholder seems OK to me.
17:26 mtholder since there aren't that many nodes, I think storing them all on the fs and writing a controller that just returns them would work.
17:26 mtholder we'd have to either fill that fs cache or have a forwarding mechanism for 404s
17:27 mtholder (cache misses i should say).
17:27 mtholder but, I found redis easy to use in phylesystem-api
17:27 mtholder so it should not be hard to set up.
17:28 jimallman hm, that’s an interesting alternative. i’m not finding much on simple apache config for redis. i suppose because someone needs to know how to interpret the initial request… there’s an apache module (https://github.com/sneakybeaky/mod_redis) that might make sense here.
17:29 jimallman or i can proxy these calls through web2py and use its caching decorators (easy!)
17:30 mtholder I think all the responses are about 20GB if gzipped.
17:30 mtholder would it be easy for the client code to take zipped responses?
17:31 jimallman good question. i would think so, yes, with appropriate headers.
17:34 jimallman in fact, we should set this up in apache for all responses. it seems we’re not compressing now, even though all our target browsers would support it.
17:45 jimallman i’m going to try proxying through web2py with a disk cache, and see how that works. perhaps a general caching controller, so we can try different methods like so:  http://devapi.opentreeoflife.org/cached/treemachine/v1/getSyntheticTree
17:46 jimallman (web2py intercepts /cached/ URLs, calls the original method if no value found in its cache)
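
The `/cached/` pass-through described above can be sketched in plain Python as follows. This is not web2py code and does not use its caching decorators: the `CachingProxy` class, the `fetch` callable, and the in-memory dict standing in for a disk cache are all assumptions made for illustration.

```python
import hashlib
import json


class CachingProxy:
    """Serve /cached/ URLs from a cache, calling the original method on a miss."""

    PREFIX = "/cached/"

    def __init__(self, fetch):
        self.fetch = fetch  # hypothetical callable(path, params) -> response body
        self.store = {}     # stand-in for a disk (or redis) cache
        self.misses = 0

    def _key(self, path, params):
        # Key on both path and parameters so different requests never collide.
        blob = json.dumps([path, params], sort_keys=True)
        return hashlib.sha1(blob.encode("utf-8")).hexdigest()

    def get(self, url_path, params=None):
        if not url_path.startswith(self.PREFIX):
            # Not a /cached/ URL: pass straight through, uncached.
            return self.fetch(url_path, params)
        original = "/" + url_path[len(self.PREFIX):]
        key = self._key(original, params)
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.fetch(original, params)
        return self.store[key]
```

With this shape the cache fills on demand, so nothing has to be precomputed, and the uncached URL keeps working unchanged alongside the cached one.
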
18:03 mtholder sounds good.
18:04 mtholder (he says 17 minutes later)
18:04 mtholder left #opentreeoflife
19:04 jar286 setting up in apache requires converting GET to POST
19:04 jar286 in which case we could use apache caching
19:04 jar286 jimallman ^
19:05 jar286 I think on-demand would be enough
19:05 jar286 but opinions may vary
19:06 jar286 precomputing them all is expensive (up front) and schedule-risking
19:06 jimallman i’m tinkering with the web2py option. so far, so good.
19:07 jar286 oh, just read your 12:45 comment. sounds good
19:07 jimallman it should be able to handle HTTP verbs, headers, etc. it’s not ideal, but maybe good enough.
19:08 jar286 but who does the GET to POST?  a script of yours?
19:08 jar286 or does it cache POSTs?
19:09 jimallman i’ll need to review the apache config. this is to let RESTful API use GET, but translate to POST for neo4j, yes?
19:09 jar286 yes
19:09 jar286 usually to get caching you have to use GET
19:09 jar286 apache doesn’t convert GET to POST and doesn’t cache POST
19:10 jar286 but I know nothing about web2py
19:10 jimallman ah, i see your point. web2py will accept any verb, so i guess it’s more a question of client-side (or intermediate) caching.
19:10 jimallman i think web2py will happily cache if i ask it to.  i’ll know soon.
22:31 codiferous joined #opentreeoflife
