
IRC log for #opentreeoflife, 2015-05-11


All times shown according to UTC.

Time Nick Message
00:10 jar286 joined #opentreeoflife
00:19 codiferous joined #opentreeoflife
02:02 jar286 joined #opentreeoflife
10:03 mtholder joined #opentreeoflife
11:02 mtholder joined #opentreeoflife
12:51 jar286 joined #opentreeoflife
12:58 mtholder joined #opentreeoflife
14:50 blackrim joined #opentreeoflife
15:00 mtholder joined #opentreeoflife
15:13 codiferous joined #opentreeoflife
15:15 codiferous jimallman, jar286, any news on the oti issue?
15:15 jimallman no news here
15:20 mtholder left #opentreeoflife
15:55 mtholder joined #opentreeoflife
16:21 jar286 codiferous, I want to test it on devapi, and rebuild the index there, then we can merge the changes
16:21 jar286 but I was out most of the morning and now have a chore to do first
16:21 codiferous ok
16:21 codiferous i will be here on irc until 6pm or so, let me know if anything comes up
16:21 jar286 ok thanks
16:22 codiferous also available to talk about doc stuff if that is useful
17:37 josephwb joined #opentreeoflife
18:27 7GHAAENHX joined #opentreeoflife
18:33 jar286 testing out oti and ot-base changes on devapi now
18:42 josephwb good luck
18:43 jar286 failed.
18:44 jimallman details?
18:44 jar286 https://github.com/OpenTreeOfLife/oti/issues/38
18:45 jimallman gotcha, reading and testing now
18:46 jar286 set branches for ot-base and oti to new-synth-update
18:46 jar286 cody said he’d be around, maybe he’ll be able to fix it quickly
18:49 jimallman i’ll add an equivalent curl call to the issue comments
18:54 jimallman https://github.com/OpenTreeOfLife/oti/issues/38#issuecomment-101016191
19:14 jar286 jimallman, I’m going to force a rebuild of oti… I can see no reason why it would work, but then I see no reason we should get a missing class error, either
19:17 jar286 damn… it’s working now...
19:17 jar286 jimallman ^
19:20 mtholder joined #opentreeoflife
19:28 blackrim joined #opentreeoflife
19:30 * jimallman is reading now
19:31 jimallman hm, i’m not sure it’s working as expected..
19:32 jimallman my curl call (see link above) now returns a 200 response, but with a big stack trace in the middle of it
19:33 jar286 oh no…
19:34 jimallman posted to issue #38
19:34 jar286 actually I should have noticed that those calls were returning just a bit too quickly.
19:34 jar286 200 + error is a no-no
19:34 jar286 neo4j is pretty good about turning exceptions into 500s.  you just need to rethrow
19:35 jar286 after printing the stack trace
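
The "print the stack trace, then rethrow" pattern described above can be sketched as follows. The plugin under discussion is Java code running inside neo4j; this is only a Python illustration of the same idea, and the names `index_one` and `do_index` are hypothetical.

```python
import logging
import traceback

log = logging.getLogger("indexer")


def index_one(url, do_index):
    """Run do_index(url); on failure, log the stack trace and re-raise."""
    try:
        return do_index(url)
    except Exception:
        # Record the trace for whoever reads the server log ...
        log.error("indexing failed for %s\n%s", url, traceback.format_exc())
        # ... then re-raise, so the caller (for example a web framework)
        # can report an error status instead of a 200.
        raise
```

Swallowing the exception after logging it is what produces the "200 + error" response; the bare `raise` keeps the original exception and traceback intact.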
19:35 jimallman agreed.. if it’s any consolation, the new error is different: “java.lang.NullPointerException”
19:35 jar286 I don’t see how this worked in testing for cody but doesn’t on devapi… but I’ll let him figure it out
20:06 josephwb hi mtholder
20:07 mtholder hi
20:07 josephwb i was wondering if you could confirm my numbers in this file: https://drive.google.com/open?id=0B82Y0El5V8fXSXVSblgzUUthX0E&authuser=0
20:07 josephwb i can get you all of the inputs
20:09 mtholder um. sure. I guess we are just checking for an otcetera issue?
20:10 josephwb well, i did the intersection stuff in python, so: checking josephwb
20:10 josephwb [so many issues]
20:10 mtholder if so, then perhaps sending me part of your bash history w/ the invocations would also be the best route.
20:10 mtholder OK. I can certainly run them.
20:10 josephwb thanks
20:10 josephwb i think they are fine
20:28 jar286 it’s really stupid that google docs search doesn’t search inside of comments… makes it almost impossible to match email to the document (without loading the document, which takes forever)
20:34 jimallman It’s America’s favorite game show: “Data, or Metadata”?
20:36 codiferous joined #opentreeoflife
20:36 codiferous walked home early due to severe thunderstorm/tornado watch. looks like the oti stuff is working ok now?
20:37 jar286 not at all.
20:37 jimallman not so much. see the logs:  http://irclog.perlgeek.de/opentreeoflife/2015-05-11#i_10584898
20:37 josephwb ok, email sent mtholder
20:37 codiferous looking now
20:37 josephwb thanks again
20:38 codiferous joined #opentreeoflife
20:39 codiferous is the study url correct?
20:39 codiferous "http://devapi.opentreeoflife.org/phylesystem/v1/../ study pg_719"
20:40 josephwb codiferous: when you get a chance, can you please look over this: https://docs.google.com/document/d/1qq9VZccfPMG9Xic0wmp5BXMur98KrjXOY3-ZVuKzz1U/edit#heading=h.j5sffo3hvw8a
20:40 codiferous ok, thanks joseph
20:40 josephwb great, thanks
20:43 codiferous jar286, jimallman, this seems to work:
20:43 codiferous curl -vv -H "Content-Type: application/json"      --data '{"urls": ["http://devapi.opentreeoflife.org/v2/study/pg_719"]}'      http://devapi.pentreeoflife.org/v2/studies/index_studies
20:43 codiferous at least, doesn't produce an exception
20:43 * jimallman is trying this now...
20:44 jimallman yes, that looks like success (id appears in ‘indexed’ array)
20:45 jar286 if the url needs to be changed, that’s still an oti problem, just in a different place (the index_current_repo.py script)
20:45 jimallman interesting.. so the v2 API call works.. yes, that’s an easy change if it’s hard-coded in index_current_repo.py
20:45 jar286 but that we get a 200 when there’s an error is still a problem (a different one, admittedly)
20:45 codiferous i tried a very small number of queries on my local machine yesterday and the indexing seemed to be working
20:46 jimallman quick note to anyone trying the curl call above.. change devapi.pentreeoflife to devapi.opentreeoflife
20:46 codiferous well, if you want it to return a 500, i can just tell the plugin to throw an exception
20:46 jar286 the index operation was proceeding as if there were no error - no messages, nothing
20:46 jar286 that’s why I thought the problem was solved when it wasn’t
20:46 jar286 yes please, tell it to throw
20:46 codiferous yes, that was because we wanted it to try everything
20:47 codiferous but it can still try everything and then throw an exception
20:47 jar286 not sure what you mean ‘try everything’
20:47 codiferous of course, that reduces the level of detail in the response
20:47 codiferous the entire list of urls provided
20:47 jar286 you mean, it’s not an error if some study ids don’t have corresponding studies?
20:47 codiferous originally there was a method to do one url at a time but for some reason somebody wanted that changed
20:48 codiferous it doesn't return a 500, no
20:48 jar286 I thought it was doing one url at a time. that’s what I see on the console
20:48 codiferous it returns a bunch of info in the response about the errors it encounters
20:49 jar286 error -> 500.  not error -> 200.
20:49 jar286 if there’s some situation that involves not indexing that’s not an error, that can be a 200, but then we have to check the response body
20:49 codiferous if it does one at a time then we should perhaps be using a service that has that level of granularity
20:50 codiferous that simplifies the question of what to do about the response
20:50 jar286 ‘a bunch of info’ can go in a 500 response
20:50 codiferous but not through neo4j
20:50 codiferous it just returns its own json response with the stack trace associated with the exception
20:50 jar286 the python script gets a list of studies, then indexes them one at a time. i don’t see anything wrong with that, it’s worked for a long time
20:51 codiferous i could load all the other data into the 'message' property but that is clunky
20:51 jar286 but if it’s an error it must not be a 200. it can be a 500, that’s fine, with lots of info
20:51 codiferous all the info about errors would have to go into a single string
20:52 jimallman ah, the v1 study works too, it was just garbled in my curl example (fixing this now)...
20:52 codiferous not json, but a string within the json
20:52 jimallman (v1 study URL, that is)
20:53 jar286 when there’s an exception what does neo4j put in the body? stack trace? if so, just put all the info you care about in the exception’s toString()
20:53 jar286 in the 500 body that is
20:54 jimallman fyi - i’ve updated my earlier comment to show the working study URL:  https://github.com/OpenTreeOfLife/oti/issues/38#issuecomment-101016191
20:54 codiferous yes, that is what i mean. that toString() method would contain potentially many stack traces, exception class names, and the contents of each individual exception's toString() message
20:55 jar286 doesn’t neo4j show the stack trace in a 500 response
20:55 codiferous presumably encoded in json again to keep it all straight
20:55 codiferous just for the exception that is thrown
20:55 jar286 oh jeez
20:55 codiferous but the point of accepting multiple urls is to try them all
20:55 jar286 piece of $#@
20:56 jar286 you’re saying it’s a non-error for one of the index operations to be an error?
20:56 jar286 that’s crazy talk
20:56 codiferous we could change the behavior so it throws the first exception it encounters
20:56 codiferous well right now it just collects the errors and returns them when it's attempted every incoming url
20:56 jar286 if there’s some ordinary situation where behavior should be defined, we just make that not be an error
20:57 jar286 but the method is never used with more than one url
20:57 jimallman i like the collect-and-return method.. can’t we just raise an explicit exception with the full current payload?
20:57 codiferous yes
20:57 codiferous but
20:58 codiferous if we do, then we have to double-encode all the "current payload" into a json string, that will be buried inside a property called "message" or something
20:58 jar286 if there’s a bug in the server code, then there should be a 500 regardless
20:58 jar286 if the failure is some kind of condition you’d expect from certain phylesystem states, that’s not a bug in the server code
20:59 jar286 the two cases have to be distinguished
20:59 codiferous that sounds like a different issue
20:59 jimallman i assume the state-based failure would map to some kind of 4xx response, yes?
21:00 jar286 I’m just saying if you use the HTTP protocol, you should use it as intended
21:00 jar286 yes, 4xx
21:00 jar286 i’m not saying this has to be done now, i’m just saying we should approximate as best we can
21:00 jimallman http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_Error
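
The 4xx versus 5xx split discussed above might look roughly like this. It is a sketch under assumptions: the exception classes `StudyNotFound` and `MalformedStudyUrl` and the function `status_for` are hypothetical, and the choice of 400 and 404 is one reasonable reading of "state-based failure", not something the participants settled on.

```python
class StudyNotFound(Exception):
    """Hypothetical: the requested study does not exist in phylesystem."""


class MalformedStudyUrl(Exception):
    """Hypothetical: the caller sent a url the service cannot parse."""


def status_for(exc):
    """Pick an HTTP status code for the outcome of one indexing request."""
    if exc is None:
        return 200  # nothing went wrong
    if isinstance(exc, MalformedStudyUrl):
        return 400  # the request itself was bad
    if isinstance(exc, StudyNotFound):
        return 404  # expected condition that depends on repository state
    return 500      # anything unexpected is treated as a server bug
```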
21:00 codiferous what is better: 2 simple options
21:01 jar286 since we’re not using the multi-url form, I don’t see why not just do a 500 on server error. write the detailed error information to the neo4j log
21:01 codiferous 1. just use a service that indexes one study at a time. allow neo4j to return its native error format (includes a stack trace and exception name) if it encounters one
21:01 jar286 and then we can have an issue to tease apart the 4xx and 5xx cases later
21:02 jar286 umm, my guess is that neo4j uses printStackTrace, and that does toString on the exception.  it’s not just the exception name
21:02 codiferous 2. change the behavior of the current service to throw a dummy exception *after* collecting some arbitrary number of errors, and double-encode the errors into a json string inside the dummy exception
21:02 codiferous ... the dummy exception's error message
21:03 codiferous both of those will return a 500
21:03 codiferous i prefer option 1
21:03 codiferous cleaner, simpler java code, and easier to get useful info from the response
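
The two options can be compared side by side in a small sketch. This is Python rather than the plugin's Java, and `index_each`, `index_all_then_raise`, `IndexingFailed`, and the `index_one` callable are all hypothetical names used only for illustration.

```python
import json


class IndexingFailed(Exception):
    """Hypothetical wrapper raised after every url has been attempted."""


def index_each(urls, index_one):
    """Option 1 style: the first failure propagates unchanged."""
    for url in urls:
        index_one(url)
    return True


def index_all_then_raise(urls, index_one):
    """Option 2 style: try every url, then raise one exception whose
    message is a JSON string describing all of the failures."""
    indexed, errors = [], {}
    for url in urls:
        try:
            index_one(url)
            indexed.append(url)
        except Exception as exc:
            errors[url] = "%s: %s" % (type(exc).__name__, exc)
    if errors:
        # The details end up as JSON encoded inside a message string,
        # which is the double encoding mentioned in the discussion.
        raise IndexingFailed(json.dumps({"indexed": indexed, "errors": errors}))
    return indexed
```

With option 2 a client has to decode the message string a second time to get at the per-url details; with option 1 the native exception and stack trace are available directly.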
21:03 jimallman i have no strong feelings. (i was concerned about API consumers, but it appears we don’t really document these methods)
21:04 jar286 for now this is a very simple change. just change the existing method to throw an exception when there is one.  the service is never used with more than one url.  if you want to make a new service, that’s fine, and marginally cleaner, it just seems like unnecessary delay
21:04 codiferous nobody should be using the indexstudy method on the public servers unless they're working directly with us, i would imagine
21:04 jimallman agreed
21:04 codiferous sure, that is fine
21:05 jar286 correct
21:05 jar286 the only thing that’s documented is the ‘index’ command in the deployment system, which internally invokes that python script
21:07 jar286 regarding the current problem, should I retry indexing now? or does index_*.py need to be changed?
21:08 jimallman after changing my curl call to match the proper study URL, it seems to me that your indexing should work now.
21:09 jar286 so the .py script has to be changed.
21:10 jimallman by which i mean, indexing via cURL is looking good in my latest tests:  curl -vv -H "Content-Type: application/json" \
21:10 jimallman --data '{"urls": ["http://devapi.opentreeoflife.org/phylesystem/v1/../default/v1/study/pg_719"]}' \
21:10 jimallman http://devapi.opentreeoflife.org/v2/studies/index_studies
21:10 jimallman gah, what a mess
21:11 jimallman my working test suggests that codiferous has fixed something important, and that it’s already running on devapi.
21:12 jar286 nobody has pushed anything to devapi
21:12 jimallman the URLs in your original report look right to me (and the study URL returns full Nexson)
21:14 jar286 so what has changed?
21:14 jimallman my curl test was returning errors because the study URL was broken (obviously so, in retrospect). now that i’m using a proper study URL, it’s working for me. so there must be a subtle difference between the failing call and my test.
21:14 jar286 was it working before and we thought it wasn’t?
21:14 jar286 now I want the 500 fix, so that I don’t have to change the indexing script to look at the response body
21:15 jimallman the failing call dumps lots of critical information, and the URLs in particular look right to me. i’m referring to your report here: https://github.com/OpenTreeOfLife/oti/issues/38#issuecomment-101013368
21:16 codiferous new service: index_study
21:16 codiferous curl -X POST http://127.0.0.1:7474/db/data/ext/studies/graphdb/index_study/ -H 'content-type:application/json' -d '{"url":"http://devapi.opentreeoflife.org/phylesystem/v1/../default/v1/study/pg_719.json"}'
21:16 jar286 this is confusing
21:16 codiferous will just throw any exception through native neo4j procedure
21:17 codiferous the multiple-url method will also now throw the first exception it encounters. I have added notes indicating that method is deprecated
21:17 jar286 still in the same branch?
21:18 codiferous new-synth-update
21:18 codiferous on that pr
21:18 jar286 ok, I’ll try it
21:26 jar286 jimallman, you speak of ‘critical information’, but is that anything other than the exception description and stack trace?
21:26 jimallman i just meant the URLs, which make it pretty easy to reconstruct the call using (for example) cURL
21:29 jar286 hmm. the indexing script prints the URL
21:29 jar286 it’s taking forever to start the oti neo4j instance
21:35 jar286 now why would there be a python process running on ot10?… maybe i’ll kill it
21:37 jar286 well one of them (at least) is web2py… maybe all of them
21:37 jimallman weird… i was going to guess a cron job, or long-running indexing operation recently?
21:37 jar286 no, it’s apache.  the pids are consecutive
21:37 jar286 apache/web2py that is
21:38 jimallman right. sounds harmless, unless you’ve tried to stop the webserver
21:38 jar286 I never know whether a SIGINT in the ssh client is going to SIGINT on the server side
21:38 jar286 I’m going to leave the pythons alone
21:38 jar286 oti is up now i think
21:39 jar286 ok. now i’m going to try indexing.
21:39 jar286 getting 500 Server Error: Internal Server Error
21:40 jar286 now I guess I have to modify the .py script to show me the body
21:41 codiferous hm
21:41 jimallman is it our old friend pg_719?
21:41 jar286 Indexing http://devapi.opentreeoflife.org/phylesystem/v1/../ study pg_719 from http://devapi.opentreeoflife.org/phylesystem/v1/../default/v1/study/pg_719.json
21:41 jar286 Calling "http://127.0.0.1:7478/db/data/ext/studies/graphdb/index_studies" with data="{'urls': ['http://devapi.opentreeoflife.org/phylesystem/v1/../default/v1/study/pg_719.json']}"
21:41 jar286 Indexing failed for http://devapi.opentreeoflife.org/phylesystem/v1/../default/v1/study/pg_719.json
21:41 jar286 500 Server Error: Internal Server Error
21:42 jimallman my original curl call for this returns the traditional “NoClassDefFoundError”, which is annoying
21:43 jar286 oh.  let me try deploying again.
21:43 codiferous it seems to be working for me
21:43 codiferous (locally)
21:44 jar286 installed curl on ot10
21:45 jar286 yep, no class def
21:46 jimallman codiferous: when you say “locally”, do you mean you’re testing on ot10 via ssh, or on your local test box?
21:48 codiferous my local machine
21:49 jimallman that makes sense, thanks
21:51 jar286 AttributeError: 'list' object has no attribute 'items'
21:52 codiferous hm, that's interesting
21:52 jar286 codiferous, did you update the plugin incompatibly with the python script?
21:52 codiferous perhaps, though i tried not to. let's see
21:52 jar286 it’s looking for an errors key in the 200 response body
21:53 jimallman maybe from https://github.com/OpenTreeOfLife/oti/blob/d21d70d8927447f568f86ec0be84ea7aff8b6912/index_current_repo.py#L81
21:53 codiferous yes, there is no longer any need for that object
21:53 jimallman looks like index_studies was returning a dict of errors, now a list
21:54 codiferous i had it returning an empty list
21:54 codiferous but it should have been an empty map
21:54 jimallman gotcha
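
The `AttributeError` above is what Python raises when code calls `.items()` on a list where it expected a dict. A client could guard against either shape with something like the sketch below. The `errors` key is the one mentioned in the discussion; the function name `error_items` and the treatment of a list as messages without urls are assumptions.

```python
def error_items(body):
    """Return (url, message) pairs from a decoded JSON response body."""
    errors = body.get("errors") or {}
    if isinstance(errors, dict):
        return sorted(errors.items())
    # Assumption: a list carries messages with no url attached.
    return [(None, message) for message in errors]
```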
21:54 codiferous just pushed a change
21:54 codiferous it would be better to use the index_study method
21:56 jar286 looking now at how to best get the 500 response body
21:56 jar286 you can change it if you like
21:58 jimallman jar286: will you get enough information from this (HTTPError.message)?  https://github.com/OpenTreeOfLife/oti/blob/d21d70d8927447f568f86ec0be84ea7aff8b6912/index_current_repo.py#L79
21:58 jar286 no. I don’t see an advertised method for getting the response body out of the exception.  need to avoid doing r.raise_for_status().
21:59 jar286 or delay it at least. or save r in a place where it can be examined after the exception is raised.
21:59 jar286 annoying, but requests is very simpleminded
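
One way to keep the response body available, instead of calling `r.raise_for_status()` straight away, is to check the status and raise an exception that carries the body. The sketch below is not the actual change made to index_current_repo.py. `IndexingError` and `check` are hypothetical names, and `FakeResponse` is a stand-in that mimics only the `status_code` and `text` attributes of a requests response so the example runs without a network.

```python
class IndexingError(Exception):
    """Hypothetical exception that keeps the failing response's details."""

    def __init__(self, url, status_code, body):
        super().__init__("%s returned %s" % (url, status_code))
        self.url = url
        self.status_code = status_code
        self.body = body


def check(response, url):
    """Raise IndexingError, body included, for any 4xx or 5xx response."""
    if response.status_code >= 400:
        raise IndexingError(url, response.status_code, response.text)
    return response


class FakeResponse:
    """Stand-in with the two attributes the check relies on."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text


try:
    check(FakeResponse(500, "java.lang.NoClassDefFoundError"), "pg_719")
except IndexingError as err:
    print(err, "|", err.body)
```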
21:59 jimallman looks like the original thinking was “# We don't really need any data out of r.”
21:59 jar286 (sort of like neo4j in ways)
22:00 jar286 at least I can read the source code for requests and it makes sense.
22:02 jar286 codiferous, are you editing index_current_repo.py ?  I can spruce it up if you’re not
22:03 codiferous http://docs.python-requests.org/en/latest/user/quickstart/#response-content
22:03 codiferous i am not
22:04 jimallman codiferous: i was just browsing the same page (jinx)
22:04 jar286 yes, I know how to get the content of a response.  the problem is that the response object has been thrown away by the point in the code where it’s needed
22:04 jar286 not a big deal, i’ll take care of it
22:05 jar286 so you have committed everything you want to the new-synth-whatever branch? I can commit to it?
22:05 jimallman some notions for how to check response more carefully before raise_for_status:  http://stackoverflow.com/questions/24237185/python-requests-library-exception-handling
22:05 * jimallman returns to his knitting..
22:05 jar286 yes, it’s not a problem, thanks
22:05 codiferous yes, i do not have any outstanding changes
22:11 jar286 codiferous, one last thing, can you send me a curl example of using the new service?
22:11 jar286 (or paste into irc)
22:14 codiferous curl -X POST http://127.0.0.1:7474/db/data/ext/studies/graphdb/index_study -H 'content-type:application/json' -d '{"url": "http://devapi.opentree.org/phylesystem/v1/../default/v1/study/pg_719.json"}'
22:14 jar286 ok gotta go, thanks
22:14 codiferous what you might expect. it will either return 'true' on success, or return the neo4j error response if it encounters an exception
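
A caller of the new service could interpret the result along these lines. This is a sketch that assumes the success body is the JSON literal `true`, as described above, and that anything else should be treated as a failure; `interpret_index_study` is a hypothetical helper, not part of oti.

```python
import json


def interpret_index_study(status_code, body_text):
    """Return True on success; otherwise raise with the server's reply."""
    if status_code == 200:
        try:
            if json.loads(body_text) is True:
                return True
        except ValueError:
            pass  # body was not JSON; fall through and report it
    raise RuntimeError(
        "index_study failed (%s): %s" % (status_code, body_text)
    )
```

Raising on anything other than a 200 with `true` means the indexing script never has to guess whether a quiet response was a success.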
23:12 codiferous joined #opentreeoflife
23:14 codiferous joined #opentreeoflife
