The web in a box - a next generation web framework for the Perl programming language

IRC log for #mojo, 2016-05-15


All times shown according to UTC.

Time Nick Message
00:04 punter joined #mojo
00:33 damaya joined #mojo
00:33 damaya In this example, http://mojolicious.org/perldoc/Mojolicious/Plugin/DefaultHelpers#delay, how does each step indicate to move to the next step?
00:33 damaya In other words, in delay steps, how do I say "this step is done, move on."
00:34 preaction damaya: when the begin callback is called
00:35 bpmedley damaya: A great question.  Doesn't $delay->begin return a callback that allows the step to continue when called?
00:35 preaction yes. once all the begin callbacks have been called, then the step is done and the next step is allowed to run
00:36 damaya Ah, ok excellent, so the $delay->begin is necessary :)
00:37 preaction yes. otherwise, it's not really async
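
The exchange above says a delay step is finished once every callback returned by begin has been called. A minimal Python sketch of that counting rule follows. The `Delay` class, its methods, and the demo steps are hypothetical names for illustration; this is not Mojo::IOLoop::Delay or its API. As a simplification, results are collected in completion order.

```python
class Delay:
    """Toy model of sequential steps gated by begin() callbacks."""

    def __init__(self, *steps):
        self.steps = list(steps)
        self.pending = 0      # callbacks handed out but not yet called
        self.results = []     # values passed to callbacks, forwarded to the next step

    def begin(self):
        """Return a callback; the current step is done once all of them were called."""
        self.pending += 1

        def done(*args):
            self.results.extend(args)
            self.pending -= 1
            if self.pending == 0:
                self._next()

        return done

    def _next(self):
        if not self.steps:
            return
        step = self.steps.pop(0)
        args, self.results = self.results, []
        guard = self.begin()  # keeps the step open until it has returned
        step(self, *args)
        guard()

    def start(self):
        self._next()
        return self


log = []
callbacks = []

def step1(delay):
    log.append("step1 started")
    callbacks.append(delay.begin())
    callbacks.append(delay.begin())

def step2(delay, *args):
    log.append(f"step2 got {args}")

Delay(step1, step2).start()
callbacks[0]("a")             # one callback still outstanding, step2 has not run
after_first = list(log)
callbacks[1]("b")             # last callback called, so step2 runs now
```

In this sketch, a step that never calls begin moves on immediately, because the internal guard callback is the only one outstanding.
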
01:11 vicash hello. while using Minion, is there a way to enqueue a task that depends on the executed results of a set of independent tasks before it. Similar to a scatter-gather type operation, where say I use a cluster to execute 100s of tasks and then run a single task after those 100s of tasks are completed to perform a "reduce/gather" operation on the results of each of those tasks.
01:18 vicash I guess I can enqueue the tasks on a custom queue and test their completion before doing the reduce operation.
01:47 _fildon_ joined #mojo
02:03 lluad joined #mojo
02:06 noganex joined #mojo
03:31 zivester joined #mojo
06:01 dod joined #mojo
06:07 dod joined #mojo
06:16 damaya joined #mojo
06:24 kaare joined #mojo
08:15 Vandal joined #mojo
08:17 kaare joined #mojo
09:12 Vitrifur joined #mojo
10:10 eseyman joined #mojo
10:28 meshl joined #mojo
10:35 punter joined #mojo
10:47 kaare joined #mojo
12:17 damaya joined #mojo
12:25 AndrewIsh joined #mojo
12:43 jberger vicash: job dependencies have been discussed before
12:44 jberger Interestingly I might have an implementation that could work for pg
13:11 kaare joined #mojo
13:38 Kripton joined #mojo
14:05 punter joined #mojo
14:06 zivester joined #mojo
14:18 vicash jberger: is your implementation available for viewing ?
14:33 jberger Not yet
14:33 jberger But i hope to soon
14:51 odc joined #mojo
14:55 vicash cool
15:09 PryMar56 joined #mojo
15:18 Kripton joined #mojo
16:04 jberger vicash: absolutely no guarantee of this going in, minion is actually getting a little "full" for adding lots of features, but here is what I have extracted from some code developed for $work: https://github.com/kraih/minion/issues/27#issuecomment-219294261
16:46 sri joined #mojo
16:47 sri o/
16:48 jberger sri: \o
16:49 bpmedley \o/
16:50 jberger sri: it seems to cost a few hundred rps
16:50 jberger that is on the existing benchmark (i.e. what people who aren't using it pay)
16:50 jberger but let me see if I can add a shortcut check first
16:57 sri you'd think it could cost nothing for people that don't use dependencies
16:57 sri if the order of the checks is right
16:57 jberger yeah
17:00 jberger this is within my margin of error http://paste.ubuntu.com/16442647/
17:00 jberger of the original
17:01 jberger let me try to make it simpler by allowing parents to be null rather than start at []
17:03 Kripton joined #mojo
17:05 jberger this one is simpler, but might actually be a tad bit slower: http://paste.ubuntu.com/16442776/
17:07 sri so, the first one is less than 100 rps slower here
17:07 sri more like 50 rps
17:07 sri once the query planner was warmed up
17:07 sri takes a few runs here to warm it up
17:07 jberger I noticed that
17:08 sri postgres 9.5.3
17:08 sri that is 50 rps at 2550 rps
17:08 sri <3 core m7
17:10 sri really fun to watch this little gadget to get an idea for what the cpu is doing https://software.intel.com/en-us/articles/intel-power-gadget-20
17:10 * jberger installs
17:11 sri 1.3ghz cpu sounds kinda slow, but when there's work it stays 100% in turbo, bursts to 3.1ghz and throttles to 2.4ghz when the cpu temp starts rising
17:15 sri jberger: ok, for the second version i don't notice a difference to the original
17:15 sri as in it's just as fast as without dependencies
17:15 jberger cool
17:16 sri don't care about simpler when there's a performance cost
17:17 jberger do you see a performance hit on the third one?
17:17 sri have not tested since i'm too lazy to change the schema ;p
17:18 jberger if you see only 50 rps hit between master and the first one then I doubt you'd see any difference between the last two patches
17:18 jberger my problem is I see more run to run variance than you seem to, which means I don't know if I can quantify to that level
17:19 jberger I'm trying out running the benchmark script in a bash loop to let the cpu spin up
17:20 sri think i like it more when the default is an empty array
17:21 sri not sure
17:23 jberger ok I see now effectively no difference (on perceived average) between master and the latter two patches
17:24 sri one more test you can do to really get a feel for the cost
17:24 sri remove the ->finish from the dequeue loop in the benchmark
17:25 sri then you can test pure dequeue performance
17:29 sri that does confirm that the second and third versions are almost equal to master
17:29 sri at 4650 dequeues/s
17:30 sri what do you get on your i7?
17:31 sri with 4 cores you should get a really nice boost
17:33 jberger I'm in the 5100rps ranges
17:33 sri Oo
17:33 jberger without finish
17:34 sri interesting, would have expected more
17:35 jberger all kinds of possible variation though
17:35 jberger I'm on battery for example
17:35 sri ah
17:36 jberger plus I know what this laptop sounds like when it's working hard and I don't hear that
17:36 jberger when I play Cities: Skylines it gets loud
17:36 jberger and hot
17:37 sri cities: skylines is crazy cpu heavy, can't play it on os x here
17:37 jberger I can, I <3 that game so much
17:37 jberger I have to be on AC though, it sucks the battery dry
17:39 sri anyway, this is definitely a candidate for minion core
17:40 sri just needs a strategy for cleaning up abandoned jobs
17:40 jberger the next question is should there be cleanup either directly via repair (check for parent rows that are failed and fail them) or via an "expires" time
17:40 jberger I was just typing that
17:41 sri repair seems cleaner
17:41 sri expires could be orthogonal
17:41 jberger so failed parent implies failed child, yeah
17:42 sri only problem is retried jobs
17:42 jberger that's where an "expires" field might be nice
17:42 jberger gives you a window to retry a parent
17:43 jberger but doesn't necessarily have to wait until the normal repair cleanup
17:46 sri how expensive is expires?
17:47 jberger hehe, i never actually implemented that one since it was clearly possible :D
17:47 jberger at $work I had this super complex locking requirement that was much more challenging
17:48 sri all solutions seem to be using triggers, which looks rather expensive
17:48 jberger for expires?
17:48 sri yes
17:48 jberger oh I was just going to have a column with a time in the future and add it to the dequeue and repair queries
17:49 sri hmm
17:49 sri suppose that works too
17:49 jberger not as immediate as a trigger would be
17:50 sri i guess it doesn't really matter for all the other queries
17:51 jberger that was mostly inspired by our use case though
17:51 jberger "install this system, then reboot the server" and on the reboot job add an expires so that if it didn't happen within an hour or two it wouldn't accidentally trigger in a day or something
17:51 sri much better for performance for sure
17:53 sri yes, i guess i'm actually ok with that implementation
17:53 jberger full disclosure, it sadly looks like we are probably going to have to write our own job runner system, which is why I want to at least propose merging this work into minion
17:54 sri batches+expires
17:54 jberger the model we need for job dequeue and locking is just too far outside of the minion paradigm
17:54 sri well, told you that was my expectation anyway ;p
17:54 jberger yeah, I know
17:55 jberger I just want it known that I'm not angling this for me but as a feature that has been requested many times
17:55 jberger I want the implementation to help most users not just how I envisioned it for ServerCentral
17:55 sri and i'm only ok with it because the implementation is good :)
17:56 jberger of course, I'd expect no different
17:56 jberger it was actually kinda a special case of the more general locking mechanism that I had working
17:57 jberger I might still show that one off
17:57 sri the more i think about the cleanup problem, the more expires makes sense
17:57 jberger but I don't think it belongs in minion core
17:57 jberger (my locking solution)
17:57 sri i suppose you'd want some time to review failed jobs and restart them manually before the whole batch fails
17:59 sri what does expires mean though? deletion or failed?
17:59 jberger I'd say failed
17:59 sri i'd assume failed with Expired error
17:59 jberger right
17:59 sri ok
18:00 sri +1 then
18:01 jberger I'll work up a patch for expires and open a PR
18:01 sri that leaves us with the second variant (default empty array), and expires in dequeue/repair
18:01 jberger you like the empty array better than check for null
18:01 sri yea, it's nicer for job_info
18:01 jberger ah, true
18:02 sri job_info will also get a children value
18:02 jberger it's kinda odd that array_length('{}'::bigint[], 1) returns NULL
18:02 jberger I'd have thought 0
18:03 jberger how do you calculate children?
18:03 sri just immediate children
18:03 sri select for ids with a parent value of the current id
18:03 sri that's enough to show it in an admin ui and walk the graph
18:05 jberger yeah, that's neat
18:05 sri job_info is not performance critical either
18:06 sri enqueue, dequeue, repair and stats are the ones i focus on for performance
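
The children lookup described above (select the ids whose parents include the current id, then walk the graph from there) can be sketched as follows. This is an illustrative in-memory model in Python, assuming each job stores a list of parent ids. The `jobs` dictionary and both function names are hypothetical, not Minion's schema or API.

```python
# Each job records only its parents; children are derived on demand.
jobs = {
    1: {"parents": []},
    2: {"parents": [1]},
    3: {"parents": [1, 2]},
    4: {"parents": []},
}

def children(jobs, job_id):
    """Immediate children: jobs that list job_id among their parents."""
    return sorted(cid for cid, job in jobs.items() if job_id in job["parents"])

def descendants(jobs, job_id):
    """Walk the graph by following immediate children repeatedly."""
    seen = []
    todo = children(jobs, job_id)
    while todo:
        cid = todo.pop(0)
        if cid not in seen:
            seen.append(cid)
            todo.extend(children(jobs, cid))
    return seen
```

For example, `children(jobs, 1)` returns `[2, 3]`, which is all an admin UI would need to render one level of the graph.
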
18:09 sri not sure if it's worth caring about cycles
18:10 sri expires kinda takes care of it, but leaves it up to the user
18:10 jberger cycles?
18:10 jberger like parent -> child -> parent ?
18:10 sri yes
18:10 jberger it's kinda hard to do since you need to know the id to specify as the parent
18:11 jberger I guess you can if you have a longer chain (as I just showed)
18:11 jberger but "don't do that" works for me :p
18:11 sri or am i wrong and you can't really do it?
18:12 sri suppose you'd have to guess the id
18:12 sri that would be weird
18:12 sri non-existing id is a case though
18:12 jberger yeah, you're right, I don't see how you could do it
18:12 sri needs to be tested
18:13 jberger I guess so, but I think the query will take care of that
18:13 sri think so too
18:13 sri still needs a test :)
18:13 jberger use -1 as an id?
18:13 jberger parent id that is
18:14 sri sure
18:14 jberger oh, well there is one issue then
18:14 jberger what if I depend on job 12 and job 12 is repaired away (cleanup 2 days)
18:15 jberger then suddenly the child job is live
18:15 jberger I guess that's why you set expires
18:15 punter joined #mojo
18:15 sri yes
18:16 dod joined #mojo
18:16 sri i think there is no way to avoid that case
18:16 jberger agreed
18:16 sri even if repair fails all depending jobs immediately, the repair might happen too late and a user already removed the failed parent
18:19 jberger I guess that means documenting the relationship between parents and expires too
18:28 sri oh hey, this job queue has the same dependency implementation http://python-rq.org/docs/
18:29 sri they call it depends_on
18:29 jberger I like parents if we are going to have a way to see children
18:29 jberger but generally I don't hate depends_on
18:30 sri think i'm ok with parents/children
18:30 jberger "A job that is dependent on another is enqueued only when its dependency finishes successfully."
18:31 sri you mean dequeued
18:31 jberger if I wrote that same sentence for this implementation I would change s/enqueued/dequeued
18:31 sri hehe
18:31 jberger no that's from the docs you linked
18:31 sri oh
18:32 jberger I wonder if it actually defers enqueuing
18:32 jberger that would mitigate the repair early problem
18:33 sri it does
18:33 sri there's temporary storage, which gets checked when a job finishes
18:33 jberger ah
18:33 PryMar56 joined #mojo
18:35 sri these redis based queues have so many places where jobs can get lost
18:36 sri when those jobs are taken out of temp storage, they would just vanish if the worker dies before it could enqueue them
18:36 jberger there are ways you could do that without temporary storage
18:37 jberger an enqueued flag on the job
18:37 jberger or else a pre-inactive state
18:37 sri was about to say state
18:40 sri btw. sidekiq batches work very differently, you enqueue a bunch of jobs together, then they run parallel, and the last one runs a special callback
18:41 jberger when I was working on my extensions you were surprised to see me implement a parallel feature
18:43 sri haha, this one does it the same as us https://github.com/socialpandas/sidekiq-superworker#superjob-expiration
18:43 sri apparently the most popular sidekiq solution
18:44 sri although the expiration is tied to a superjob, which encapsulates the subjobs
18:46 sri jberger: back to the non-existing parent case
18:46 sri those jobs should not be able to run right?
18:47 sri that solves all the edge cases i think
18:49 sri aside from never getting removed from the queue, which is merely a doc problem i think
18:50 sri suppose repair could even cover that
18:50 jberger yeah, though it makes the query harder (I think)
18:50 jberger I'm eating lunch atm, brb
18:51 sri guess i'll be eating ice cream :)
18:52 AndrewIsh joined #mojo
18:52 jberger nice
18:52 sri developing an addiction to these :S http://www.benjerry.com/files/live/sites/systemsite/files/flavors/products/eu/wich/novelties/son-of-a-wich-detail.png
18:54 jabberwok Mojo documentation index auto-generated at http://wlindley.com/mojo/Mojo.html ... created with https://github.com/lindleyw/pod-index (finally got that index generator quarter-way presentable)
18:54 jberger sri: wow
19:02 jberger jabberwok: that looks interesting, I don't have time to dig it at the moment
19:22 jberger I'm trying to adapt the query to require all the parent rows
19:42 jberger sri: https://github.com/jberger/minion/commit/883e217fa2de34cfc7fb1177e9a76d32ab13006a
19:43 jberger cardinality is a more sensible length (I think)
19:47 lluad joined #mojo
20:12 punter How can I "construct" a controller object $c, so that I'll be able to call $c->url_for(...) from within a cron script, rather than from a web request? Or is that something I shouldn't even attempt to do?
20:14 sri pretty sure that cookbook recipe has been linked to you multiple times now
20:14 punter YES! But I forgot where it was. Now, thanks, I'll check the cookbook! Thanks for pointing
20:16 sri jberger: i guess expires is not really needed for job dependencies
20:16 sri so, do we have enough good use cases anymore?
20:17 jberger put it a different way, it is probably a different feature at this point
20:17 sri yea
20:17 jberger one thing postgresql is not good at is preserving an intentional null
20:18 sri now that jobs with a non-existing parent just don't run, repair can handle everything on its own nicely without edge cases
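
The rule discussed here, that a job becomes runnable only once every parent exists and has finished, can be sketched as a simple eligibility check. This Python model is an assumption-laden illustration of the rule as stated in this conversation. The state names, the `jobs` dictionary, and `can_dequeue` are hypothetical, and the real dequeue query in the linked commit may differ.

```python
jobs = {
    1: {"state": "finished", "parents": []},
    2: {"state": "active",   "parents": []},
    3: {"state": "inactive", "parents": [1]},      # only parent is finished
    4: {"state": "inactive", "parents": [1, 2]},   # parent 2 is still running
    5: {"state": "inactive", "parents": [-1]},     # parent id does not exist
    6: {"state": "inactive", "parents": []},       # default empty array, no dependencies
}

def can_dequeue(jobs, job_id):
    """A job may run if it is inactive and all of its parents exist and are finished."""
    job = jobs[job_id]
    if job["state"] != "inactive":
        return False
    # all() over an empty list is True, so jobs without parents pay no extra cost.
    return all(p in jobs and jobs[p]["state"] == "finished" for p in job["parents"])

runnable = [job_id for job_id in sorted(jobs) if can_dequeue(jobs, job_id)]
```

Under this rule job 5 never becomes runnable on its own, which matches the point above that such jobs stay in the queue until something like repair deals with them.
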
20:19 * jberger throws away the messy query that enqueue became with expires and wipes brow
20:20 sri the feature also needs a name, batches sounds a bit meh, maybe job dependencies, or even a more pretentious job dependency graphs
20:20 jberger job parents? it's what we are actually calling it
20:20 sri don't like it
20:20 jberger dependencies would be my next choice
20:21 jberger graphs isn't wrong
20:21 sri the names parents and children are organizational, they don't describe the feature
20:21 jberger do you like the implementation of required parents?
20:23 sri looks good, have not checked performance yet
20:25 jberger nor have I
20:25 sri :o
20:25 jberger I was working on expires :s
20:27 sri expires is so easy to do on your own
20:28 sri if you need it for a specific task
20:28 sri just put the time in an argument and put a check in the task sub
20:28 jberger right
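
The do-it-yourself expiry suggested above (pass the deadline as a job argument and check it inside the task) might look roughly like this. It is a Python sketch under stated assumptions: the `Expired` exception, the `reboot_task` function, and the `expires` and `host` argument names are all hypothetical, chosen to mirror the reboot use case mentioned earlier in the log.

```python
import time

class Expired(Exception):
    """Raised when a task starts after its deadline, so the job ends up failed."""

def reboot_task(args, now=None):
    """Refuse to run if the deadline carried in the job arguments has passed."""
    now = time.time() if now is None else now   # injectable clock for testing
    if now > args["expires"]:
        raise Expired("Expired")
    return f"rebooting {args['host']}"

# Enqueue side: the deadline is computed once and travels with the job arguments.
job_args = {"host": "db1", "expires": time.time() + 2 * 60 * 60}
```

Because the check lives in the task itself, no extra column or query change is needed in the queue.
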
20:31 jberger curse this variability
20:33 meshl joined #mojo
21:04 jberger heh, I can tune the benchmark so that it is almost entirely based on the cpu temp
21:06 sri haha
21:11 jberger the funny thing is, when I do shorter runs (ie not temp based) it looks like I see a small performance knock with either patch now
21:11 jberger either meaning first or both commits
21:12 jberger I also have the pg on the same computer, maybe it would help not to do that
21:12 jberger I suppose I could try on my colo
21:13 jberger I don't have pg installed there yet though
21:23 punter joined #mojo
21:39 punter joined #mojo
22:12 eseyman joined #mojo
23:35 zivester_ joined #mojo
23:42 sri some opinions if this should be minion core might also be nice
