
IRC log for #opentreeoflife, 2014-09-04


All times shown according to UTC.

Time Nick Message
00:31 kcranstn joined #opentreeoflife
00:57 codiferous joined #opentreeoflife
01:13 towodo joined #opentreeoflife
04:50 mtholder joined #opentreeoflife
12:23 towodo joined #opentreeoflife
12:58 kcranstn joined #opentreeoflife
13:55 kcranstn joined #opentreeoflife
13:58 codiferous joined #opentreeoflife
14:35 towodo jimallman, looks like I never retired ot3. doing so now, ok?
14:44 jimallman ok by me. looks like this was only used in Atta
14:51 towodo ok, thanks
14:55 scrollback joined #opentreeoflife
14:59 kcranstn just going to grab coffee before the hangout
15:00 codiferous the hangout is at 1pm, right?
15:00 kcranstn aw, crap. Yes.
15:00 kcranstn coffee!
15:00 towodo must be pretty good coffee
15:00 kcranstn not really
15:01 kcranstn but it is warm and caffeinated
15:01 kcranstn and free and copious
15:01 towodo (if you want to take 2 hours to enjoy it)
15:01 codiferous well at least you have plenty of time to find better coffee
16:30 codiferous joined #opentreeoflife
17:01 kcranstn shall I start the hangout, or has someone else already done it?
17:01 towodo_ joined #opentreeoflife
17:01 codiferous go ahead
17:02 towodo_ am i invited?
17:03 kcranstn yes!
17:03 kcranstn invited cody, jim, joseph, mark
18:30 towodo joined #opentreeoflife
18:49 towodo jimallman, can I ask you about supporting files?
18:49 jimallman sure! ask away
18:49 jimallman i do not know if anyone’s running the backup script...
18:49 towodo I just ran it
18:50 jimallman we had talked about making a cron job, iirc
18:50 towodo oddly, there were no supporting files.
18:50 towodo so when someone enters a newick string, that string doesn’t become a supporting file - right?
18:51 towodo but that’s OK since you can reconstruct the newick string from the nexson, right?
18:52 towodo I found a bunch of supporting files on ot3, but they were jpegs, docx files, etc. - looks like tests
18:53 josephwb joined #opentreeoflife
18:53 jimallman i thought we were preserving the raw tree import as a supporting file, yes
18:53 jimallman those are likely the only “real” supporting files, unless someone’s being really diligent
18:54 towodo hmm.  I checked ot3, ot14, and ot16, and didn’t see any supporting files except the images etc. on ot3
18:54 jimallman here’s a search of phylesystem-1, but it’s probably going to hit every study that has been in the curation app:
18:54 jimallman https://github.com/OpenTreeOfLife/phylesystem-0/search?utf8=%E2%9C%93&q=supporting_file_info&type=Code
18:54 towodo hmm… wonder if OTI could do that
18:55 josephwb phylografter has supporting files. don't know if they were ever copied over
18:55 josephwb e.g. matrices
18:55 josephwb is that what you mean?
18:55 jimallman interesting! that’s the kind of thing we’d want, yes.
18:56 jimallman here are three studies in phylesystem-1 with supporting file info:
18:56 jimallman https://github.com/OpenTreeOfLife/phylesystem-0/blob/05d0c3afecdaffc39a06d5697b9ee2572c89a7f1/study/ot__2/ot_2/ot_2.json#L98
18:56 jimallman https://github.com/OpenTreeOfLife/phylesystem-0/blob/1c858f24c0aa3a7b132c72e02dd5a99dbddc1bd7/study/ot_33/ot_33/ot_33.json#L466
18:56 jimallman https://github.com/OpenTreeOfLife/phylesystem-0/blob/1c858f24c0aa3a7b132c72e02dd5a99dbddc1bd7/study/ot_33/ot_33/ot_33.json#L98
18:56 towodo that’s what I mean, but files from phylografter can be picked up later… right now I’m wondering if I’ve found all of the new ones for backup purposes, and if the curator app is functioning as desired
18:57 jimallman hm, in each case above, the “file” was actually a Newick string, so it’s not stored… this record is left so it will appear in the Files tab of the curation app, and associated with its tree. odd that i didn’t store the actual Newick. maybe (as you suggest) the idea was that it’s easy to reconstruct..?
18:58 towodo OK, good, then we haven’t lost anything…
18:58 towodo seems safer to keep the original newick as a supporting file - extracting from newick is fragile
18:58 towodo i’m thinking about dryad export
18:59 jimallman this seems to be a more accurate search, for “@filename”
18:59 jimallman https://github.com/OpenTreeOfLife/phylesystem-0/search?utf8=%E2%9C%93&q=%22%40filename%22&type=Code
18:59 jimallman re: original newick, yes, and this is really the intent, to capture original (vs. edited) data
18:59 jimallman s/edited/curated
19:00 jimallman ah, there’s one study that expects to have a real supporting file:
19:01 jimallman https://github.com/OpenTreeOfLife/phylesystem-0/blob/4cba62d54bf1cf73f2b0319f512aba1b89bd4d0d/study/ot_29/ot_29/ot_29.json#L927
19:01 jimallman interesting, the same file appears in two annotations (possibly the same tree, imported twice?): https://github.com/OpenTreeOfLife/phylesystem-0/blob/4cba62d54bf1cf73f2b0319f512aba1b89bd4d0d/study/ot_29/ot_29/ot_29.json#L106
19:02 jimallman WAIT! my bad, that was a search of phylesystem-0 (test data)
19:02 towodo ot_29 was curated by josephwb on Aug 2
19:02 jimallman towodo: ^
19:02 towodo oh. whew
19:02 josephwb right. was playing with mapping
19:03 josephwb on dev
19:03 jimallman here’s the same search on phylesystem-1, the real stuff:
19:03 jimallman https://github.com/OpenTreeOfLife/phylesystem-1/search?utf8=%E2%9C%93&q=%22%40filename%22&type=Code
19:03 jimallman (it does seem to show some real files)
19:03 towodo you’re going to run that and give me the results?
19:03 jimallman i can dig for real files and send a list, yes. probably as a GitHub gist.
19:04 towodo ouch… I’m not sure where they would be… but we need to know
19:05 jimallman agreed. building the list now...
19:08 towodo we might have lost files from the first production round (‘Atta’)… that would have been prior to July 23… which I think is pre-launch
19:26 jimallman OK, I’m seeing a consistent (but weird) pattern in all the studies with real supporting files. Each one has two annotations listed, one with no URL and another with the upload URL, eg, /curator/default/to_nexson?output=input&uploadid=u1413b66b-ce04-4f5a-bdc6-0eb8874366c0
19:26 jimallman unfortunately, in the Files tab i’m using the wrong annotation so no download is even attempted. but in any case, i’m building a list of the upload URLs in all studies.
19:31 scrollback joined #opentreeoflife
19:50 kcranstn joined #opentreeoflife
20:06 jimallman towodo: here’s a list of all the supporting files (and their upload URLs) in phylesystem-1:  https://gist.github.com/jimallman/d45f127f4e7e1dfec36a
20:19 codiferous jim, can you confirm that you don't use any of the taxonomy GetJsons services?
20:19 codiferous jimallman ^^
20:19 jimallman just a sec, grep’ing the code again
20:19 codiferous haha, nice. supposed to just be a pointer at the previous line, but apparently i made a smiley
20:24 jimallman codiferous: getConflictTaxJsonAltRel and GetJsons/node both appear in the config files for the webapps, but i suspect they’re not actually used… one moment while i confirm
20:24 towodo ok, this isn’t good.
20:25 towodo I go to http://tree.opentreeoflife.org/curator/study/view/ot_130/?tab=files  and try to get the .tre file, and nothing happens.
20:25 towodo study uploaded on Aug 5…
20:25 towodo that was in the Bos epoch…
20:26 towodo that would have been ot14…
20:26 towodo which is still in operation…
20:26 towodo but the mirror script doesn’t find any files
20:28 jimallman codiferous: confirmed, these URLs are defined in config and in HTML pages, but never called
20:28 codiferous ok
20:28 jimallman towodo: looking into this now…
20:29 codiferous going to get rid of that class. those methods are waaay old and shouldn't be used...
20:29 jimallman towodo: i realize the links for each file (in the Files tab) are dead; i had hoped this was because they were pulling the @url from the wrong annotation, rather than that the uploaded files really don’t exist.
20:29 jimallman (you’ll see they have the default href=“#”)
20:29 towodo ok. so curator app UI is not definitive, that’s OK
20:30 towodo web2py/applications/curator/ has no supporting_files subdirectory
20:31 towodo that’s ok, I was just trying to use the gist info as a url… wrong idea...
20:32 towodo there’s a curator/uploads on ot14 but it’s empty
20:32 jimallman checking now… supporting_files/ is a web2py controller, not a directory name
20:37 jimallman adding a test file (image) to a study on devtree (let’s see where it goes):  http://devtree.opentreeoflife.org/curator/study/edit/pg_2584?tab=files
20:38 towodo find . -name "*204016f52bc0"
20:38 towodo ./repo/opentree/curator/private/scratch/2nexml/u424da336-13e8-4e7d-b2f8-204016f52bc0
20:38 towodo that’s a UUID randomly selected from your gist. the Newicks are there prepared for conversion to nexml. but the supporting file isn’t there.
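The `find . -name "*204016f52bc0"` lookup above can also be sketched in Python, which may be handy if the same check is repeated for every UUID in the gist. This is only an illustration: the function name `find_by_suffix` is hypothetical and is not part of the opentree code.

```python
import os


def find_by_suffix(root, suffix):
    """Return sorted paths under root whose base name ends with suffix.

    Mirrors `find root -name "*suffix"`: both files and directories match.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

An empty result for a given upload id would suggest that nothing with that id survives under the directory searched, which is the situation described for the supporting file here.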
20:39 jimallman found my test image (as expected) on ot16:~/repo/opentree/curator/uploads/supporting_files.doc.a14ed73265ec07fb.4b697474795269666c652e6a7067.jpg
20:40 towodo and the mirroring script picks it up just fine
20:40 jimallman i see that the tree-import URLs are not sensible for an upload file… they’re the calls to to_nexson that do the full import, eg, /curator/default/to_nexson?output=input&uploadid=u424da336-13e8-4e7d-b2f8-204016f52bc0
20:40 jimallman so those are never going to work. i’m obviously capturing the wrong URL for these
20:41 jimallman these are the lion’s share of the URLs in my gist. just a few exceptions (for non-tree uploads), starting with /curator/supporting_files/download
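The distinction drawn here between the two kinds of URL in the gist can be sketched as a small classifier. The two path strings come from the messages above. The function name and the category labels are assumptions made for illustration, not names used in the curation app.

```python
from urllib.parse import parse_qs, urlparse

# Paths quoted in the discussion above.
TREE_IMPORT_PATH = "/curator/default/to_nexson"
DOWNLOAD_PREFIX = "/curator/supporting_files/download"


def classify_upload_url(url):
    """Label a supporting-file URL (labels are hypothetical).

    'tree-import'     : a to_nexson call carrying an uploadid, not a download link
    'explicit-upload' : a supporting_files/download link
    'missing'         : empty or absent URL
    'unknown'         : anything else
    """
    if not url:
        return "missing"
    parsed = urlparse(url)
    if parsed.path == TREE_IMPORT_PATH and "uploadid" in parse_qs(parsed.query):
        return "tree-import"
    if parsed.path.startswith(DOWNLOAD_PREFIX):
        return "explicit-upload"
    return "unknown"
```

Running something like this over the gist would separate the entries that can never work as downloads from the few that point at real uploaded files.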
20:42 towodo I wonder if somehow the uploads got deleted in going from Bos to Cavia… but I thought we just updated the opentree repo, which shouldn’t have touched the uploads directory
20:43 * jimallman is checking to see if the /uploads directory is tracked by git… maybe it gets clobbered when we fetch the latest repo?
20:43 towodo in any case that doesn’t explain why Cavia has no uploads e.g. ot_208
20:43 towodo git pull doesn’t delete files IIUC
20:45 jimallman that makes sense, and curator/uploads/ isn’t tracked in the repo anyway.  i thought maybe we’re trying to be extra-careful and clobbering things in our deployment scripts..
20:45 towodo ot_208 was curated August 20, Cavia dates from Jul 27
20:46 towodo that’s very unlikely.
20:47 towodo uploads on ot14 was last modified on Jul 9.  If any update had deleted files therein, last mod date would be later.
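The reasoning above relies on the fact that, on POSIX filesystems, a directory's modification time changes whenever an entry is created, deleted, or renamed directly inside it. A minimal sketch of that check follows. The function names are hypothetical, and the check says nothing about changes inside nested subdirectories.

```python
import os
from datetime import datetime, timezone


def last_modified(path):
    """Return the modification time of path as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


def changed_since(path, cutoff):
    """True if path was modified after cutoff (a timezone-aware datetime).

    For a directory, this covers adding or removing direct entries only.
    """
    return last_modified(path) > cutoff
```

If `changed_since` is false for the uploads directory with the deployment date as the cutoff, a deletion during that deployment is unlikely, which is the argument made here.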
20:48 jimallman understood. (also, i see that we clobber old web2py sessions/* but nothing else)
20:50 towodo my guess is that manually uploaded files are fine, but saving newick and nexml uploads doesn’t work
20:51 jimallman agreed. i’d suggest we try an upload on ot14 (production), if you don’t mind. just a standard binary first, to see if it can be downloaded and where it goes.. this is a test study:  http://tree.opentreeoflife.org/curator/study/view/pg_2823/?tab=files
20:51 jimallman (at this point, i just want to know that the simplest case is working)
20:51 towodo hmm… how does this help? the problem manifests on dev
20:52 towodo why not debug on dev?
20:52 jimallman i thought you were not finding a simple upload (image, etc)
20:52 jimallman if the bugs are restricted to tree imports/uploads, then yes, we can reproduce (and fix) on dev
20:53 towodo yes, it’s restricted to imports.  maybe this is a new feature?  although if it was never meant to work I don’t understand the gist
20:54 towodo that is, I believe it’s restricted to imports, and that we have not had any explicit (non-import) file uploads yet
20:54 towodo you just demonstrated that explicit uploads work
20:54 jimallman ok, i just thought we were missing the one or two previous cases (can find them in the gist)
20:54 towodo all the ones in the gist are missing, but they are all implicit uploads (imports)
20:55 jimallman ot_62 and ot_39 each seem to have had explicit uploads
20:56 jimallman they’re also pretty old, possibly from Atta
20:58 jimallman and yes, i need to start debugging on devtree (or my local) and sort out the problems with tree-import files (duplicate annotations, missing and bad URLs, not capturing the Newick string or uploaded files)
20:58 towodo ot_62 is from July 3… indeed… so that was an accidental deletion (totally forgot to save uploads when deleting Atta)
20:58 jimallman apologies, this obviously never should have been sent to production in this state.
20:59 towodo there are few enough of the explicit uploads that we can ask the curators to re-upload… I’m not too worried about that
21:00 towodo and the imports are not *that* important… just want to make sure things are good going forward
21:01 jimallman agreed. the dead links are dumb in any case. i’ll remove the href=“#” in cases where we don’t have a proper @url.
21:01 jimallman so we get a text message instead of a link.
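The proposed behavior (a real link only when a proper @url exists, plain text otherwise) might look roughly like the sketch below. It is written in Python purely for illustration. The function name, the markup, and the fallback wording are assumptions and do not reflect the curation app's actual templates.

```python
from html import escape


def render_file_entry(filename, url=None):
    """Return an HTML fragment for one entry in a file list.

    With a usable URL, emit a link; otherwise emit plain text rather
    than a dead href="#" link.
    """
    label = escape(filename)
    if url:
        return '<a href="{}">{}</a>'.format(escape(url, quote=True), label)
    return "<span>{} (no download available)</span>".format(label)
```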
21:01 towodo sorry to interrupt you.  the gist is very helpful and puts my mind at ease.  if you want we can turn this into an issue and you can get back to your regularly scheduled program
21:01 jimallman absolutely, sounds great.
21:07 towodo oh… I see, *all* the files in the gist are explicit uploads, none are from imports.  so that’s worse than I thought. I will have to figure out what to do about this
21:08 towodo (thinking aloud) which brings back the question, why did your explicit upload on dev work,  but all these explicit uploads on prod fail?  It’s not just that Atta got deleted, that doesn’t account for all of them.
21:09 towodo so you are right that we need to test uploads on production.
21:17 jimallman towodo: no, your original assumption was correct.
21:17 jimallman the “files” with to_nexson in the URL are tree-import data.
21:18 jimallman not Newick strings pasted into a field, but actual files uploaded for tree import.
21:19 towodo oh… but in that case why isn’t the sequence of study ids dense in the gist? e.g. no ot_142, 144, 145, 146, …
21:19 towodo oh I see
21:19 jimallman they’re normally grabbed by peyotl(?) for import, and probably saved (as you saw) in a scratch/ directory somewhere. i need to grab a copy as well for the uploads/ area
21:20 jimallman and build a proper download URL, as we see for the other upload types (images, etc)
21:20 jimallman so there are two forks in the road: whether a file is uploaded in the Files tab, or the Trees tab....
21:20 towodo ahh… the sequence isn’t dense, because many were pasted newicks
21:20 jimallman and whether (if in the Trees tab) the data is a true file upload or text pasted into a textarea.
21:21 jimallman right, i didn’t bother to include the pasted-newick studies in the gist (no point)
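The density question raised above (which study numbers are absent from a list of ids) can be checked with a short sketch. The function name is hypothetical, and the ids in the usage note are made-up sample input chosen to reproduce the gaps mentioned in the discussion, not the contents of the gist.

```python
import re


def missing_study_numbers(study_ids, prefix="ot_"):
    """Return the numbers absent between the lowest and highest id seen.

    Ids that do not start with the given prefix (for example pg_ studies)
    are ignored.
    """
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    present = {int(m.group(1)) for m in map(pattern.match, study_ids) if m}
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For sample input `["ot_141", "ot_143", "ot_147"]` the gaps reported are 142, 144, 145 and 146. As explained here, a gap in the gist does not by itself mean a lost file, since studies entered as pasted Newick text were left out on purpose.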
21:25 towodo ok.  so there are only two missing explicit-uploads, and they both expired when Atta did.
21:25 jimallman yes, it seems so.
21:26 towodo ok, so I can figure out who the curators were and grovel to get the files again… that was really my fault, I just totally forgot about it
21:27 jimallman you were right to hold open the GitHub issue… we just never followed up and automated this.
21:28 towodo I’m going to set up a cron job now… hey I did one backup to my laptop, I wonder if that has anything (will look)
21:31 codiferous i just submitted a pull request for the taxonomy/* services. tagged josephwb and towodo with a request to review
21:32 towodo tnx
21:33 * jimallman is off to run an errand, back within the hour
21:36 codiferous going to have to rename the old TNRS class, because the new name is the same, just different case
21:37 codiferous so will need to update apache redirects to reflect that so existing urls don't break. figured i'd just use "TNRS_v1" for the old one, and "tnrs" for the new one
21:37 codiferous acceptable, towodo?
21:37 codiferous towodo ^
21:38 towodo why does it have to be renamed?
21:38 towodo java is case sensitive
21:38 towodo does jetty do case folding maybe?...
21:39 codiferous eclipse doesn't like it, at the very least
21:39 codiferous eh
21:40 codiferous seems to be working anyway
21:40 codiferous so apparently, nevermind
21:40 towodo just asking because it’s one more thing for me to screw up
21:40 towodo it means the apache config and plugins would have to be updated in lockstep
21:40 towodo and I hate locksteps as you know…
21:41 codiferous yeah, i would prefer not to. at first it wouldn't let me, gave me the error "same name as X in different case"
21:41 codiferous but then i tried a different way, and it works. figured the error meant there was an exception to case-sensitivity for class names, but if it doesn't need to change, i won't change it
21:41 towodo if you’d like to do TNRS_v1 that’s fine, we can cope
21:43 codiferous hopefully no need
21:44 towodo jimallman, cron job set up now (on varela.csail.mit.edu).  I’ll recruit some other mirrors (e.g. Mark, Stephen)
21:50 kcranstn joined #opentreeoflife
23:02 codiferous joined #opentreeoflife
