IRC log for #opentreeoflife, 2014-09-09

All times shown according to UTC.

Time Nick Message
00:30 josephwb joined #opentreeoflife
01:13 josephwb joined #opentreeoflife
03:13 josephwb joined #opentreeoflife
03:32 jimallman_ joined #opentreeoflife
05:32 mtholder joined #opentreeoflife
05:58 ilbot3 joined #opentreeoflife
05:58 Topic for #opentreeoflife is now Open Tree Of Life | opentreeoflife.org | github.com/opentreeoflife | http://irclog.perlgeek.de/opentreeoflife/today
07:15 mtholder joined #opentreeoflife
09:49 mtholder joined #opentreeoflife
11:09 mtholder joined #opentreeoflife
11:35 josephwb joined #opentreeoflife
12:16 towodo joined #opentreeoflife
12:39 kcranstn joined #opentreeoflife
12:44 kcranstn joined #opentreeoflife
13:16 kcranstn joined #opentreeoflife
13:38 josephwb joined #opentreeoflife
13:54 mtholder jimallman, would you mind if I slightly change the behavior of def _fetch_duplicate_study_ids in phylesystem-api/controllers/default ?
13:54 mtholder currently it throws an exception if OTI is not reachable
13:54 mtholder I'd prefer it to return None for the duplicateStudyIds
13:55 jimallman ok by me (though it’s technically not a correct response, right?)
13:55 mtholder the caller of the GET on a study will be able to tell that the call failed, because (I think) you get an empty list from OTI
13:55 mtholder if there are no dups
13:55 jimallman i see, None vs. [ ]
13:56 mtholder OTI should always be online, so maybe I'm being too anxious. Just seems like phylesystem should function even if there is a timeout in the oti call.
13:56 jimallman yeah, we have a lot of these kinds of issues, where the different services are entangled and mutually dependent.
13:57 jimallman i’m not quite sure how to gently decouple things. maybe this is the way.
13:57 mtholder yes (wrt your None vs [] comment).
13:57 mtholder the immediate reason I was tempted to do this was because my tests don't always fully configure phylesystem-api (no oti stuff).
13:58 mtholder I can add the oti config, but I kinda like testing with it all decoupled when I can get away with it.
13:58 jimallman yes, testing is a good use case for this kind of isolation.
13:59 mtholder alternatively, I could make the response omit the 'duplicateStudyIDs' property if that call did not succeed. Not sure which is more appealing for the client to deal with (None or absent key)
14:00 jimallman we’ve also talked about making it easier to spin up test servers (and private systems) without all the prior work of GitHub registration, etc. i’d like to set up a huddle on this sometime.
14:01 jimallman re: omit the ‘duplicateStudyIDs’ if not available… that seems reasonable, and no worse than the unexpected None (client will need to handle the missing information either way).
14:02 jimallman omitting supplemental data (vs. an explicit error message) is an interesting general pattern for these situations. it (correctly) implies a “diminished” system without full capabilities, and the logic to handle this would communicate that as well.
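
[Editor's sketch] The two options discussed above (return None, or omit the key entirely) could look roughly like this. This is a minimal illustration, not the actual phylesystem-api code: the function names, the `query_index` callable standing in for the OTI call, the `data` key, and the use of Python 3 exception types are all assumptions made for the example.

```python
from typing import Callable, List, Optional


def fetch_duplicate_study_ids(query_index: Callable[[], List[str]]) -> Optional[List[str]]:
    """Return duplicate study IDs, or None if the index service is unreachable.

    An empty list means "asked, and there are no duplicates".
    None means "could not ask at all".
    """
    try:
        return list(query_index())
    except (ConnectionError, TimeoutError):
        # Hypothetical failure modes; a real client may raise other exceptions.
        return None


def build_study_response(study: dict, query_index: Callable[[], List[str]]) -> dict:
    """Build a GET payload, omitting the supplemental key when the lookup failed."""
    response = {"data": study}
    duplicates = fetch_duplicate_study_ids(query_index)
    if duplicates is not None:
        response["duplicateStudyIDs"] = duplicates
    return response


def unreachable_index() -> List[str]:
    """Stand-in for an index service that is offline."""
    raise ConnectionError("index service offline")
```

With this shape, a client can distinguish "no duplicates" (key present, empty list) from "diminished system" (key absent) by checking for the key before reading it.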
14:04 josephwb jimallman: are all of the commits pushed now?
14:04 jimallman in phylesystem-1? yes, the production system should reflect all the work on old and new production sites.
14:05 jimallman let me know if you see discrepancies and i’ll go fish...
14:05 josephwb when did all of this get pushed?
14:06 josephwb i ask because another dropped commit broke synthesis
14:06 josephwb the Eukaryote fix
14:08 jimallman the missing commits from old production were merged about 22 hrs ago:  https://github.com/OpenTreeOfLife/phylesystem-1/commit/0c64e77ae4fcdc1492d0549840cd6ec095fc4ca7
14:08 jimallman and pulled to new production shortly after (this should not have clobbered anything, but i can check)
14:08 josephwb ok, thanks. i started synthesis ~30 hours ago
14:09 jimallman tell me more about the eukaryote fix. was this on old production? do you have the commit SHA?
14:09 josephwb let me look. basically, it was an incorrectly rooted tree. it looks correct now on the curator.
14:11 jimallman ok. since synthesis pulls studies from the production API (correct?), we probably should have restarted it after merging the work from old production. just to align the results with curators’ expectations.
14:12 josephwb correct
14:12 jimallman this is way outside of my areas of expertise, but how is synthesis breaking exactly?
14:12 * jimallman is trying to decide how worried to be about this…
14:12 josephwb by having a tree improperly rooted, it messes up relationships
14:13 jimallman ok, so not invalid nexson or studies that fail to load…
14:13 josephwb in this instance, it bypasses the Eukaryote node
14:13 josephwb right, yes
14:13 jimallman gotcha. yeah, that seems like it just ingested old/wrong nexson data
14:13 josephwb just like my other study: i fixed it, downloaded it, it worked, downloaded it again, and it failed because the fix was missing
14:15 jimallman gotcha. new production (and its phylesystem API) *should* be up to date now, with no more surprises.
14:17 josephwb i am afraid that stephen's head is going to explode
14:18 jimallman he should come to #opentreeoflife for answers and reassurance
14:26 kcranstn joined #opentreeoflife
14:29 josephwb jimallman: can you succinctly explain what you meant by "local repo"? i don't understand why i was able to download updated studies at one point but not at a later point.
14:32 jimallman the API server does a *lot* of interaction with the nexson docstore (a git repo). we long ago decided that it needs its own local repo for this, with frequent sync’ing to the “ultimate” docstore on GitHub.
14:32 jimallman so by “local repo”, i mean one that exists on api.opentreeoflife.org and is completely controlled by the phylesystem-api server.
14:33 jimallman it uses push and pull (just like a human git user) to stay in sync with the repo on GitHub.
14:34 jimallman we’ve been improving the API server to reduce the “lag” time between new work being submitted by a user, and it showing up on GitHub (and in oti’s study index)
14:34 josephwb ok, so when i upload/fix a study, it is in the api repo?
14:35 jimallman right, and should be pushed to the “real” repo on github Real Soon Now (within seconds)
14:35 josephwb we pull all of our studies from the api
14:35 jimallman this triggers a refresh of the index in oti, so the new work will show up in the curation app’s study list Real Soon but not instantly.
14:36 jimallman right, pulling studies from the api means they’re actually coming from the API server’s “local” repo.
14:37 towodo josephwb, the normal state of affairs is that changes get pushed from the api repo to github immediately.  but on Aug 2 the system got “stuck” and changes got queued up.  when we moved production to a different server, it was initialized from github and didn’t have changes from Aug 2 on.  so Jim pushed those out to github manually, and then loaded them onto the new production server.  for 3 days the new
14:37 towodo had the wrong versions.
14:37 josephwb how long do things stay in that repo?
14:37 jimallman the reason we “lost work” is that the new production system had a new API server, and its local repo was populated from the one on GitHub (which was missing some recent commits)
14:37 towodo i think i just answered that?
14:38 towodo I believe we’re back to the ‘normal state of affairs’ where changes get pushed out quickly
14:40 towodo jimallman, i was going to answer mark
14:40 josephwb still a bit hazy: when i pull a study, it comes from the "local" api repo, not phylesystem?
14:41 towodo correct, but since the two have the same information it doesn’t matter
14:41 towodo except for the recent lossage.
14:41 josephwb they both contain all the same data?
14:41 towodo yes, all the time, except when there’s a bug like we saw from aug 2 to sep 5
14:42 josephwb ok, dumb question: why? why do we have duplicate repos?
14:42 josephwb safety?
14:42 towodo performance
14:42 towodo or rather latency
14:43 josephwb performance of the curator?
14:43 mtholder right latency. We don't have to make every phylesystem-api transaction wait for github
14:43 towodo latency.  so that when you click to edit a study the api doesn’t have to do a ‘git pull’ which is slow
14:44 josephwb oh, okay.
14:44 mtholder right now I'm working on making the phylesystem-api (or rather a crontab) scream at me if we have another situation of the push to github getting stuck.
14:44 josephwb scream in German, right?
14:45 kcranstn absolutely
14:46 josephwb Québécois is also acceptable
14:46 kcranstn merde!
14:47 jimallman :)
14:48 mtholder i have enough people screaming at me in german these days.
14:49 jimallman are you asking for ice in your beer?
14:49 mtholder lol.
14:49 mtholder some ice (not in beer) would be nice.
14:49 mtholder actually everyone has been extremely nice to us.
14:49 mtholder thus far.
14:50 jimallman thus far.
14:50 * jimallman hints darkly
14:51 josephwb jimallman: so, just to finish up. when I fixed the study, it was on the "local" repo.
14:51 josephwb that didn't get pushed to phylesystem?
14:51 josephwb but it was still on the local repo, no?
14:52 towodo mtholder, just answered your email.
14:52 jimallman the *old* local repo, on the old production server.
14:52 josephwb ah, got it.
14:52 josephwb thanks
14:52 jimallman kewl
14:54 josephwb the local repo doesn't store things in shards?
14:55 mtholder thanks, towodo. that helps a lot. The system is still too brittle. Adding a new line to the README on GitHub is enough to block future pushes.
14:55 mtholder josephwb, yes it does
14:55 towodo no, I don’t think the new line was it.  that happened way after the divergence started.
14:55 mtholder but there is only 1 shard
14:56 jimallman we could tweak the README again and see what happens...
14:56 mtholder I know the newline wasn't it. I'm just saying that would be enough to make the system constipated.
14:56 towodo oh.  i see
14:56 mtholder It won't cause a conflict, but it will make the next push fail.
14:56 towodo ahh… right.
14:57 mtholder I can add a pull before the push, but I was hoping for emily and I to have the better diffing deployed before we have automated pulling.
14:57 jimallman i thought we had a “reflexive” pull for situations like that.. or does that require a working semantic merge?
14:57 towodo you want to do a git fetch, and a merge, not git pull
14:57 towodo i think
14:57 * jimallman sees the answer before my question
14:58 mtholder yes. we do fetch then merge. I was being sloppy.
14:58 mtholder but we don't ever pull from GH (yet).
14:58 jimallman towodo: in this case, i think a proper pull (fetch+merge) is required to enable fast-forward
14:58 towodo really? … the merge won’t do a ff ?
14:59 towodo I have read advice that says *never* do git pull, always do git fetch followed by git merge FETCH_HEAD
14:59 josephwb can i confidently pull studies now, or are you guys doing some merging?
14:59 jimallman in git terms, pull = fetch + merge    … so yes.
14:59 jimallman i guess i should have said “to enable push”
15:00 towodo not sure i get that
15:00 jimallman fetch + merge (vs. pull) allows a close comparison before the merge. i think it’s mainly recommended to avoid having lots of conflicts dumped in your lap. ‘git pull’ is kind of optimistic that way.
15:01 josephwb jimallman: can i confidently pull studies now, or are you guys doing some merging?
15:01 jimallman josephwb: you should be able to pull now, yes.
15:01 josephwb great
15:02 jimallman towodo: i’m probably being imprecise here. when i encounter this kind of block to a git push, the message from git says something like “you can’t push right now, because there’s new stuff upstream. pull first, then push”
15:03 travis-ci joined #opentreeoflife
15:03 travis-ci [travis-ci] OpenTreeOfLife/phylesystem-api#593 (unblocking-push - f3e9076 : Mark T. Holder): The build passed.
15:03 travis-ci [travis-ci] Change view : https://github.com/OpenTreeOfLife/phylesystem-api/compare/d8ff34e7b29e^...f3e907690945
15:03 travis-ci [travis-ci] Build details : http://travis-ci.org/OpenTreeOfLife/phylesystem-api/builds/34822428
15:03 travis-ci left #opentreeoflife
15:04 mtholder I think that "git pull remote branch" is just an alias for "git fetch origin; git merge branch origin/branch"
15:06 mtholder it is nice in the phylesystem-api to break up the fetch and the merge so that we can handle errors in each operation differently, rather than using the alias.
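
[Editor's sketch] Splitting the fetch from the merge so each failure can be reported separately might look something like the following. It is an illustration under assumptions, not the phylesystem-api implementation: the function name, the returned status strings, the default remote and branch, and the choice of `--ff-only` (which refuses anything but a fast-forward rather than leaving conflicts in the working tree) are all picked for the example. The `run` parameter exists only so the logic can be exercised without a real repository. For reference, `git pull <remote> <branch>` is roughly `git fetch <remote> <branch>` followed by `git merge FETCH_HEAD`.

```python
import subprocess


def fetch_then_merge(repo_dir, remote="origin", branch="master", run=subprocess.run):
    """Fetch and merge as two steps, reporting which one failed.

    Returns "ok", "fetch-failed" (e.g. network or remote trouble), or
    "merge-failed" (the local branch cannot simply be fast-forwarded).
    """
    fetch = run(
        ["git", "fetch", remote, branch],
        cwd=repo_dir, capture_output=True, text=True,
    )
    if fetch.returncode != 0:
        # Nothing local has changed yet, so retrying later is safe.
        return "fetch-failed"

    merge = run(
        ["git", "merge", "--ff-only", "FETCH_HEAD"],
        cwd=repo_dir, capture_output=True, text=True,
    )
    if merge.returncode != 0:
        # Histories have diverged; this case needs a real merge strategy.
        return "merge-failed"
    return "ok"
```

A caller could retry quietly on "fetch-failed" and raise an alert on "merge-failed", since the two conditions call for different responses.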
15:08 towodo I’m still wondering about what you said about “enable ff”.  if I git fetch, then commits from github will show up in the local repo. that sounds like a ff of sorts. then if I git merge, and there are no conflicts, then git push, then the local commits will show up in github. that also sounds like a kind of ff to me.
15:09 mtholder joined #opentreeoflife
15:12 mtholder I think that the --ff option of git merge is all about whether a new commit is created for each merge. If ff is enabled, and the branch that is the source of changes is a descendant of the current branch's HEAD, then no new commit will be made.
15:12 mtholder with ff, it just advances the branch's pointer.
15:13 jimallman mtholder: thanks, i think you’ve got it.
15:13 mtholder with ff disabled, you see the commit operations.
15:13 mtholder sorry merge ops.
15:14 mtholder you see each merge as a commit, even when the change set is just fast forwarding one branch to another.
15:14 towodo ah.
15:14 mtholder I've got to run... I'll send in a pull request on the screaming on blocked pushes branch tonight (hopefully).
15:15 towodo yes i’ve seen those no-op merge commits. i’ve never understood the point.
15:15 towodo great
15:16 jimallman mtholder: thanks!
15:18 jimallman towodo: i think the benefit of non-FF merges is that the history is a little easier to follow (if you want to see when a feature branch was merged, for example)
15:19 towodo I suppose.  seems like the branch history should be orthogonal - deciding to keep this kind of info doesn’t seem to belong in the main commit stream…
15:19 jimallman towodo: in short, my initial mention of “fast forward” above was misleading and probably Just Plain Wrong
15:19 towodo ok
15:20 jimallman i think GitHub defaults to non-FF merges so we can do things like recreate a deleted branch (which is occasionally awesome)
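
[Editor's sketch] The fast-forward rule described above can be modelled on a toy commit graph. This is a simplified illustration, not git's own code: the commit graph is a plain dict mapping each commit to its parents, and the function names and returned labels are made up for the example.

```python
from typing import Dict, List


def is_ancestor(parents: Dict[str, List[str]], ancestor: str, commit: str) -> bool:
    """Return True if `ancestor` is reachable from `commit` (or is `commit`)."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def merge_kind(parents: Dict[str, List[str]], head: str, other: str,
               allow_ff: bool = True) -> str:
    """Classify what merging `other` into `head` would do."""
    if is_ancestor(parents, other, head):
        return "already-up-to-date"   # nothing new to bring in
    if allow_ff and is_ancestor(parents, head, other):
        return "fast-forward"         # just advance the branch pointer
    return "merge-commit"             # diverged history, or ff disabled


# Toy history: a <- b <- c on one line, and d branching off a.
HISTORY = {"a": [], "b": ["a"], "c": ["b"], "d": ["a"]}
```

Here `merge_kind(HISTORY, "b", "c")` is a fast-forward because c descends from b, while passing `allow_ff=False` forces a merge commit for the same pair, which is the "no-op merge commit" case mentioned above.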
15:48 josephwb jimallman: this is the call we use to pull studies:
15:48 josephwb call = "http://api.opentreeoflife.org/phylesystem/v1/study/" + studyid
15:49 jimallman ok… looks reasonable
15:49 kcranstn in v2, changes to api.opentreeoflife.org/v2/study/studyid
15:49 josephwb does that pull from phylesystem? or the local repo?
15:49 kcranstn phylesystem
15:50 jimallman actually, this pulls from the API server’s local repo, doesn’t it?
15:50 josephwb ah!
15:50 towodo it pulls from the api’s clone of the phylesystem repo
15:50 kcranstn sorry
15:50 josephwb the local repo, then
15:50 jimallman in practical terms, they are (as towodo points out) clones so there should be no difference
15:51 josephwb stephen and i were just confused about this
15:52 jimallman basically, any phylesystem API calls are operating on its local repo
15:52 josephwb so, i guess i have another stupid question: if all calls to the api use the "local" repo, what is phylesystem for?
15:52 kcranstn public repository. safety. github credentials.
15:53 jimallman if by phylesystem, you mean the nexson docstore on github… what kcranstn said.
15:53 jimallman also available for other tools besides our API, including easy cloning to a local copy for analysis
15:53 josephwb right, ok
15:57 towodo the biggest thing about writing through to github is community trust & perception. it makes the data and its management totally transparent. no worries about hidden state, servers going down, mismanagement, abandonment, etc.
15:58 mtholder joined #opentreeoflife
15:59 mtholder joined #opentreeoflife
16:08 towodo time to eat lunch and vote!
17:05 mtholder joined #opentreeoflife
17:07 josephwb joined #opentreeoflife
17:25 josephwb joined #opentreeoflife
17:47 josephwb joined #opentreeoflife
17:51 mtholder left #opentreeoflife
18:14 guest|33107 joined #opentreeoflife
18:17 mtholder joined #opentreeoflife
18:34 kcranstn joined #opentreeoflife
19:25 josephwb joined #opentreeoflife
19:51 mtholder joined #opentreeoflife
20:01 mtholder left #opentreeoflife
20:51 josephwb joined #opentreeoflife
22:25 josephwb joined #opentreeoflife
22:32 josephwb joined #opentreeoflife
