IRC log for #opentreeoflife, 2014-08-14

All times shown according to UTC.

Time Nick Message
00:12 kcranstn joined #opentreeoflife
00:28 josephwb joined #opentreeoflife
02:03 josephwb joined #opentreeoflife
03:12 josephwb joined #opentreeoflife
03:56 ilbot3 joined #opentreeoflife
03:56 Topic for #opentreeoflife is now Open Tree Of Life | opentreeoflife.org | github.com/opentreeoflife | http://irclog.perlgeek.de/opentreeoflife/today
04:11 ilbot3 joined #opentreeoflife
04:11 Topic for #opentreeoflife is now Open Tree Of Life | opentreeoflife.org | github.com/opentreeoflife | http://irclog.perlgeek.de/opentreeoflife/today
05:46 jimallman joined #opentreeoflife
09:25 scrollback joined #opentreeoflife
11:57 josephwb joined #opentreeoflife
11:58 josephwb joined #opentreeoflife
12:08 josephwb joined #opentreeoflife
13:11 kcranstn joined #opentreeoflife
14:16 josephwb joined #opentreeoflife
14:23 kcranstn joined #opentreeoflife
14:47 towodo joined #opentreeoflife
15:27 josephwb towodo: just about done with function to build DB directly from ott distribution.
15:27 josephwb will version string be consistent?
15:27 josephwb e.g. "2.8draft5". not "ott2.8draft5"
15:28 josephwb my point: i am appending the "ott" prefix, unless you do it on your end
16:31 towodo joined #opentreeoflife
16:33 towodo josephwb, the version string will be consistent.  the ‘draftN’  suffix may or may not be present
16:40 josephwb ok, thanks
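
As a minimal sketch of the prefix handling discussed above: it assumes version strings look like "2.8draft5" or "ott2.8draft5", with the draftN suffix optional. The function name and the regular expression are illustrative assumptions, not treemachine code.

```python
import re

# Assumed shape: optional "ott" prefix, dotted number, optional "draftN" suffix.
_VERSION_RE = re.compile(r"(?:ott)?(\d+(?:\.\d+)*)(?:draft(\d+))?")


def normalize_ott_version(raw):
    """Return (label, draft); label always carries the "ott" prefix."""
    match = _VERSION_RE.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"unrecognized ott version string: {raw!r}")
    number, draft = match.groups()
    label = "ott" + number
    if draft is not None:
        label += "draft" + draft
    return label, (int(draft) if draft is not None else None)


print(normalize_ott_version("2.8draft5"))     # ('ott2.8draft5', 5)
print(normalize_ott_version("ott2.8draft5"))  # ('ott2.8draft5', 5)
print(normalize_ott_version("2.8"))           # ('ott2.8', None)
```

Appending the prefix only when it is missing means it does not matter which end adds "ott".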
17:01 josephwb FYI latest synthesis took 23.5 hours to run. :-O
17:01 kcranstn yikes
17:01 kcranstn that seems crazy
17:02 josephwb many more trees than before, and big ones
17:05 josephwb but, yeah: crazy
17:06 kcranstn probably want to think about performance on an upcoming call (perhaps post-hackathon?)
17:13 josephwb jimallman: should we put this new tree/DB up?
17:14 josephwb still not perfect, but good for people to peruse
17:14 kcranstn sounds good to me. How many input trees?
17:14 josephwb hmm, i'll check. a lot.
17:15 jimallman are we talking pushing this to devtree, or production?
17:15 jimallman (if the latter, we’ll need to be sure the code and db versions are compatible)
17:16 kcranstn dev first and invite internal review?
17:18 josephwb +1
17:18 josephwb for dev
17:19 josephwb jimallman: how would i get this to you/the server?
17:20 jimallman scp to devapi (ot10) would be best, something like:
17:20 jimallman $ scp mystuff/treemachine.db.tgz ot10:downloads
17:21 jimallman (assumes you have the private key for user ‘opentree’, and ot10 defined in your .ssh/config. i can provide snippets to do this)
17:21 josephwb kcranstn: 480 trees
17:22 josephwb microbes are still light. i just went with the ones that worked to get something together
17:22 josephwb jimallman: i think i have the key (remember the downtime), unless things have been updated
17:23 kcranstn 480! nice
17:23 jimallman https://gist.github.com/jimallman/d77ae7682e0273ceec95
17:24 jimallman josephwb: ^^ that’s what you want in ~/.ssh/config (if it’s not there already)
17:26 josephwb DB is 27 GB before compressing
17:36 josephwb jimallman: i do have that in my config
17:36 josephwb *still waiting on compression*
17:37 jimallman ok. while you’re waiting, maybe take a look at the more detailed feedback for missing/isolated taxa in synth tree?
17:37 jimallman http://devtree.opentreeoflife.org/opentree/argus/ottol@5554385/Drosophila-genus-in-family-Drosophilidae-
17:38 jimallman that’s a lotta flags. not sure they’re all useful.
17:38 * jimallman is going to make a sandwich, brb
17:42 josephwb jimallman: there are 2 reasons why a taxon might not be in the synthetic tree: 1) it was filtered out (i.e. it is not in the DB at all; this should be explained by the flags), or 2) it is not monophyletic (i.e. it is in the DB, but the synthetic tree does not pass through it)
17:47 josephwb *copying over 16.2 GB compressed DB now*
17:52 josephwb looks like we have a ~30 minute wait still
18:20 josephwb1 joined #opentreeoflife
18:20 josephwb joined #opentreeoflife
18:21 josephwb jimallman: DB has been copied over to ot10
18:22 jimallman thanks, i’ll install it and re-index oti
18:26 jimallman left #opentreeoflife
18:26 jimallman joined #opentreeoflife
18:29 jimallman josephwb: am i correct in assuming the file we want is Life_v2.8draft5_mapcompat_13August2014.db.tgz
18:29 jimallman ?
18:51 towodo joined #opentreeoflife
18:55 towodo there is a script for installing the .db.tgz file, but it assumes it’s in a particular form… ok to do it manually
18:56 towodo https://github.com/OpenTreeOfLife/opentree/blob/master/deploy/README.md#installing-a-new-database-version
19:08 jimallman towodo: i’m using the ‘install-db’ script, which lets you specify the path to the tgz file.
19:08 towodo yes, but it’s the paths internally in the .tgz that are at issue.
19:08 towodo if the .tgz hasn’t been created in the prescribed way then the db has to be installed manually.
19:08 jimallman oh! good to know
19:09 towodo check the path using tar tzf downloads/treemachine.db.tgz | head
19:11 towodo the file needs to look as if it has been created per https://github.com/OpenTreeOfLife/opentree/blob/master/deploy/README.md#how-to-push-the-neo4j-databases
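
The same check can be sketched in Python with the standard tarfile module. The function names are hypothetical, and the rule that entries should be rooted at '.' is an assumption about what the install script expects; the linked README is the authority on the required layout.

```python
import tarfile
from itertools import islice


def first_members(tgz_path, count=10):
    """Return the first `count` entry names, similar to `tar tzf FILE | head`."""
    with tarfile.open(tgz_path, "r:gz") as archive:
        return [member.name for member in islice(archive, count)]


def rooted_at_dot(names):
    """True if every entry is '.' itself or lives under './'.

    Assumption: this is the layout the deployment script wants.
    """
    return bool(names) and all(n == "." or n.startswith("./") for n in names)
```

For example, `rooted_at_dot(first_members("downloads/treemachine.db.tgz"))` would return False for a tarball whose entries start with a directory name such as `Life_v2.8draft5_mapcompat_13August2014.db/`.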
19:14 josephwb joined #opentreeoflife
19:15 jimallman here’s what i’m seeing (probably not what we want): https://gist.github.com/jimallman/74d1d2051a34ed629cac
19:15 jimallman towodo: ^
19:15 towodo nope. manual install needed
19:17 towodo neo4j expects the database to be in data/graph.db
19:19 towodo so (1) stop neo4j, (2) extract the files into a Life_v2.8draft5_mapcompat_13August2014.db directory somewhere, (3) move existing graph.db out of the way, (4) mv Life_v2.8draft5_mapcompat_13August2014.db graph.db, (5) start neo4j
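
The five steps above can be sketched roughly as follows. This is not install-db.sh. It assumes the tarball holds the database files at its top level; `graph.db.new` is a hypothetical staging name; and `stop_server` / `start_server` are caller-supplied stand-ins for however neo4j is stopped and started on the host.

```python
import tarfile
from pathlib import Path


def install_db(tgz_path, data_dir, stop_server, start_server):
    """Swap a new database into data_dir/graph.db, following steps (1)-(5)."""
    data_dir = Path(data_dir)
    live = data_dir / "graph.db"
    previous = data_dir / "graph.db.previous"
    staging = data_dir / "graph.db.new"  # hypothetical staging directory

    # Refuse to clobber an earlier backup or a half-finished attempt.
    if previous.exists() or staging.exists():
        raise FileExistsError("clear out old backup/staging directories first")

    stop_server()                                    # (1) stop neo4j
    staging.mkdir()
    with tarfile.open(tgz_path, "r:gz") as archive:  # (2) extract the files
        archive.extractall(staging)
    if live.exists():
        live.rename(previous)                        # (3) move graph.db aside
    staging.rename(live)                             # (4) new db becomes graph.db
    start_server()                                   # (5) start neo4j
```

Extracting into a staging directory first means a failed extraction leaves the existing graph.db untouched.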
19:19 jimallman thanks, i  was just about to ask for a link to how-to..
19:20 towodo it’s supposed to be done by the script.  I think if you just read install-db this is pretty much what it does. https://github.com/OpenTreeOfLife/opentree/blob/master/deploy/setup/install-db.sh
19:30 towodo josephwb, next time you prepare a database tarball, there is a way to do it so that it will work with the deployment scripts
19:31 jimallman https://github.com/OpenTreeOfLife/opentree/blob/master/deploy/README.md#how-to-push-the-neo4j-databases
19:31 josephwb where should i look for that?
19:31 jimallman josephwb: ^ link above
19:31 josephwb i'm too slow
19:31 jimallman towodo: yes, i’m following steps in install-db.sh
19:32 jimallman (not obsessively, just enough to follow your 1-2-3 above)
19:32 josephwb * not looking forward to working with ./push.sh -c again*
19:33 towodo you don’t have to, in this case it was just a question of preparing a db for use with the script
19:34 josephwb does the DB need to be named something specifically?
19:34 towodo yes - ‘.’
19:34 towodo see the instructions
19:37 towodo josephwb, I’d appreciate your comments on https://docs.google.com/document/d/1Ow70obuqaAS3Ga35yrjm95aN9GhDOkedPQ8ikL2g8Hk/edit
19:38 josephwb it is not clear what the DB name is supposed to be from the instructions. "graph.db"?
19:39 josephwb the example is "newlocaldb.db", but it is never renamed
19:40 josephwb or does the script do the renaming?
19:40 towodo you can call the db whatever you like locally.
19:40 towodo the tar command does -C and specifies capture of ‘.’
19:40 towodo are the instructions not clear?
19:40 josephwb yes, i see that
19:41 josephwb i think we are talking past each other
19:41 towodo ok. what is the question?
19:42 josephwb when i said "does the DB need to be named something specifically?", i meant when pushing. seems the answer is "no"
19:42 towodo the instructions say “you can call it whatever you like locally”
19:43 towodo the instructions say to use rsync - that’s wrong, you use push.sh
19:43 josephwb i was confused by your: yes - ‘.’
19:44 towodo well that was a dumb joke I guess.  all directories are named ‘.’ so it’s trivial to get the .db directory to be called ‘.’
19:44 josephwb i realize that now
19:44 josephwb i thought ‘.’ was a failed emoticon
19:45 josephwb so i took it as: yes :-O
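
To illustrate what capturing '.' means, here is a sketch that uses Python's tarfile module instead of the tar command from the README. The function names are hypothetical. The point is only that the local directory name never appears inside the archive.

```python
import tarfile


def pack_db(db_dir, tgz_path):
    """Write tgz_path with db_dir's contents stored under '.'.

    Comparable in spirit to running tar with -C db_dir and archiving '.';
    the local directory name (e.g. newlocaldb.db) is not recorded.
    """
    with tarfile.open(tgz_path, "w:gz") as archive:
        archive.add(db_dir, arcname=".")


def entry_names(tgz_path):
    """List the entry names stored in the archive."""
    with tarfile.open(tgz_path, "r:gz") as archive:
        return archive.getnames()
```

After `pack_db("newlocaldb.db", "treemachine.db.tgz")`, the names returned by `entry_names` start with `.` rather than `newlocaldb.db`, which is why the database can be called anything locally.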
19:46 * jimallman is moving ahead with manual db install this time, just FYI … but it would be nice to use the push tools.
20:19 josephwb towodo: we had talked about having a permanent url for obtaining the most recent (current) version of ott. is that going to happen?
20:21 towodo I hadn’t seen it as a priority. what’s the use case?
20:22 josephwb not a priority
20:23 josephwb working on initializing a DB directly from the ott distribution. makes things like ott-version stable across DBs.
20:23 josephwb might be nice to have a script to obtain most recent ott
20:23 towodo I think the way to do it would be to enable .htaccess and then set up a redirect in a .htaccess file that changes with each new version
20:23 josephwb at the moment, in the DB the taxonomy version is freeform
20:24 towodo there’s a big chance for operator error with that solution, but maybe the .htaccess edit can be scripted
20:24 towodo you would have to read the version.txt file to find out what version you actually got
20:24 josephwb but since the version info is in the distribution, can't we have a "release" directory?
20:25 josephwb yes, i now read the version.txt file to get the version in the DB
20:25 josephwb what i have in mind:
20:25 josephwb ./get_ott
20:25 josephwb ./treemachine initialize_ott
20:26 josephwb again, not a priority
20:26 josephwb just trying to save users some work obtaining everything separately
20:26 towodo oh yeah… users…
20:26 josephwb at the moment I get ott, and do:
20:27 josephwb ./treemachine initialize_ott ott_distr_directory
20:27 josephwb i count on the files being named the same, etc.
20:27 josephwb more stable
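
A sketch of reading the version out of an unpacked ott distribution. It assumes version.txt sits at the top level of the distribution directory and contains only the version string; the function name is hypothetical and this is not the treemachine implementation.

```python
from pathlib import Path


def read_ott_version(distribution_dir):
    """Return the version string stored in <distribution_dir>/version.txt."""
    version_file = Path(distribution_dir) / "version.txt"
    version = version_file.read_text(encoding="utf-8").strip()
    if not version:
        raise ValueError(f"{version_file} is empty")
    return version
```

Taking the version from the distribution itself, rather than from a freeform field typed in by hand, is what keeps it consistent across DBs.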
20:28 towodo you have to be able to choose between released ott and draft ott, right? because sometimes you want one and sometimes the other?
20:28 josephwb yes, probably
20:28 josephwb "release" and "bleeding_edge"?
20:38 josephwb anyway, the idea was that since any work involving treemachine + phylesystem requires ott, it might be nice to easily get ott.
20:38 josephwb again, not a priority
20:43 jimallman josephwb: i believe i’ve installed the new neo4j db correctly, but neo4j fails to restart:
20:43 jimallman https://gist.github.com/jimallman/e79cd8e8ed5b920ea63b
20:44 jimallman (that’s a tail of all related logs during an attempt to restart with the new data)
20:44 josephwb * looking now *
20:44 jimallman i’m pretty sure i have the right files in the right place (manual placement)… it looks the same as the layout on production
20:45 jimallman this is the smoking gun, possibly a recent change to treemachine code or schema?  https://gist.github.com/jimallman/e79cd8e8ed5b920ea63b#file-gistfile1-txt-L72
20:45 josephwb i am on ot10 now looking at things
20:47 jimallman thanks. production (api.opentreeoflife.org) is ot18, if you want to compare. sadly, the graph.db.previous directory on ot10 is useless, since that’s my first failed attempt to install the new Life*.tgz
20:48 josephwb i saw that
20:48 jimallman any chance we’re running different versions of neo4j on this server and your box?
20:48 josephwb possibly
20:48 josephwb what is it there?
20:49 josephwb me: 1.9.M01
20:49 jimallman how can i tell? bin/neo4j doesn’t seem to have -v, --version or the like.
20:49 josephwb neo4j info
20:50 jimallman i just get “Neo4j Server is not running”
20:50 josephwb huh. shouldn't matter
20:52 josephwb in neo4j/README it says 1.9.5
20:52 josephwb on ot10
20:52 josephwb neo4j-treemachine/README.txt
20:53 jimallman ah, thanks. so that’s a possible incompatibility.
20:53 josephwb but has it been updated recently?
20:53 towodo setup/install-neo4j-apps.sh says 1.9.5
20:53 jimallman (cleverly hidden in the README file!)
20:54 towodo neo4j 1.9.5 is the official neo4j of the open tree project
20:54 jimallman yes, and it sounds like the version on ot10 is inconsistent. so are you running the “wrong” version? i don’t know whether 1.9.M01 is newer, older…?
20:55 josephwb older, i think
20:55 josephwb i haven't upgraded since… ?
20:55 jimallman another possibility is a config issue. from what i’m reading, a bigger db might (ironically) require a smaller Java heap size:  http://stackoverflow.com/questions/24392033/unable-to-start-neo4j-server
20:55 josephwb the last DB (for the Science submission) was built on my machine
20:55 towodo http://www.neo4j.org/download/other_versions
20:56 jimallman (as a sidenote, untar’ing this monster filled the server’s hard disk. i had to throw some old stuff overboard just to unpack it)
20:56 josephwb wow
20:57 towodo (ouch - that’s a 250G disk)
20:57 josephwb it should have been 26 GB
20:57 josephwb the DB, that is
20:58 towodo 1.9.8 and 1.9.5 are very similar. seems neo4j decided to call 1.9.8 a “stable release” and deprecate 1.9.5 after I made the decision to go with 1.9.5
21:00 jimallman they (neo4j) seem to have a notion of “milestone releases” (M0#?) across which they don’t guarantee db will upgrade automatically:  http://grokbase.com/t/gg/neo4j/135z3dqf2h/update-to-2-0-0-m03
21:01 towodo I still marvel that neo4j requires about 1000x as much space as an ascii file carrying the same information
21:01 jimallman hm, but it looks like this shouldn’t be an issue for this jump: http://docs.neo4j.org/chunked/snapshot/deployment-upgrading.html
21:02 jimallman i wonder if we’re *downgrading* in moving from 1.9.M01 to 1.9.5 … ?
21:02 towodo there used to be a web page listing differences between the various 1.9.x versions. can’t find it now
21:02 jimallman …but that shouldn’t matter unless the “store layout” has changed
21:02 josephwb i think we're good.
21:03 josephwb things definitely change on 2.0+
21:03 jimallman i can try this “explicit upgrade” trick..: http://docs.neo4j.org/chunked/snapshot/deployment-upgrading.html#explicit-upgrade
21:03 towodo we decided not to touch 2.0+
21:04 josephwb yes, i know
21:07 towodo looks like 1.9.M01 is a very early version of 1.9
21:08 josephwb i am unconvinced it is a problem
21:09 josephwb but i will get the newer version and try to open my DB
21:09 towodo i too am unconvinced
21:09 towodo but the log message is “Unable to upgrade database”
21:10 jimallman https://github.com/neo4j/neo4j/releases
21:10 jimallman 1.9.M01 is about a year older than 1.9.5, but this page does say “Neo4j 1.9.5 does not require any store-level upgrades."
21:11 jimallman towodo: i gather this message sometimes indicates a different problem entirely, a neo4j configuration problem with the Java heap size:   http://stackoverflow.com/questions/24392033/unable-to-start-neo4j-server
21:12 towodo yeah…
21:12 jimallman this might be worth a try, since it seems this treemachine db is significantly larger than the previous one
21:12 * jimallman realizes this is approaching pure voodoo  :-/
21:12 towodo you are looking up how to set the neo4j heap size?
21:14 jimallman actually, i’d like to try the explicit upgrade trick first. it should either solve the problem or do nothing
21:15 towodo ok
21:17 jimallman no dice, reverting the config and will retry with Java heap changes…
21:22 jimallman again, no joy (same exceptions, same outcome)
21:23 towodo I need to go home… do we have a plan?
21:24 towodo josephwb, is the plan for you to get 1.9.5, build a new db, and we try that tomorrow?
21:24 towodo (gotta run, will check in later)
21:24 josephwb i can try that
21:25 josephwb 1.9.8, right?
21:25 josephwb i don't see 1.9.5 there
21:25 jimallman josephwb: https://github.com/neo4j/neo4j/releases
21:26 jimallman (links to https://github.com/neo4j/neo4j/releases/tag/1.9.5)
21:26 jimallman towodo: just as a sanity check, here’s what i have in the graph.db directory:
21:26 jimallman https://gist.github.com/jimallman/7d9c68c5848822b7e673
21:27 jimallman (just in case i screwed up the manual install)
21:29 jimallman ah well… fwiw, this seems to match production api (ot18)
21:29 josephwb yes
21:30 jimallman incl. ownership and permissions, which i sometimes miss
21:38 jimallman josephwb: interesting maybe here… i see that the original exception is in sun.nio.ch.FileChannelImpl.position, which i gather handles buffered file i/o..? the error is around line 275, so I’m guessing it happens here (“throw new IllegalArgumentException()” in line 274): http://pag-www.gtisc.gatech.edu/chord/examples/deadlock_test_old/sun/nio/ch/FileChannelImpl.java.html
21:39 jimallman possibly a problem with super-size files? i see that ‘neostore.propertystore.db.arrays’ in our new db is ~ 4GB
21:39 josephwb maybe?
21:40 jimallman this is not a complete wild guess.. i’m seeing reports of this error, and a change from int to long to handle position arguments in larger files
21:42 jimallman for example: https://issues.apache.org/bugzilla/show_bug.cgi?id=56447
21:43 jimallman so a bump up in FileChannelImpl version might be worth a try..? more maven fun!
21:43 jimallman but first let me check file sizes on production, in case this is a wild goose chase
21:43 jimallman red herring, that is
21:46 jimallman hm, it’s the *production* version of neostore.propertystore.db.arrays that’s 4GB. the new one is more like 6GB. so maybe, but they’re both hefty files.
21:52 * jimallman is going to un-tar the db again in case something went wrong when i ran out of hd space (back in a bit)
21:57 josephwb jimallman: i got 1.9.8 and things seem fine with the DB
22:08 josephwb ok, i need to walk away from this for a while
22:09 josephwb i will see about getting 1.9.5 and re-running things
22:44 josephwb joined #opentreeoflife
23:40 jimallman UPDATE: I cleared some more space and untar’d the database. After another manual restart of neo4j, all’s well
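
Since the fix appears to have been a clean re-extraction once disk space was available, a pre-flight check along these lines might catch the problem earlier. It is a sketch only: the function names and the 10% safety margin are assumptions, and sizing a gzip-compressed tarball means reading through the whole archive, which is slow for a multi-GB file.

```python
import shutil
import tarfile


def unpacked_size(tgz_path):
    """Total size in bytes of all regular files stored in the archive."""
    with tarfile.open(tgz_path, "r:gz") as archive:
        return sum(member.size for member in archive if member.isfile())


def has_room(tgz_path, dest_dir, margin=1.1):
    """True if dest_dir's filesystem has at least margin * unpacked size free."""
    free = shutil.disk_usage(dest_dir).free
    return free >= unpacked_size(tgz_path) * margin
```

Calling `has_room(tgz_path, dest_dir)` before untarring would flag a disk that cannot hold the extracted database, instead of leaving a partially extracted graph.db behind.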
