Perl 6 - the future is here, just unevenly distributed

IRC log for #pdl, 2014-01-03

| Channels | #pdl index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
00:21 drrho joined #pdl
00:39 drrho joined #pdl
02:33 drrho joined #pdl
03:18 jberger_ joined #pdl
04:36 drrho joined #pdl
05:04 drrho joined #pdl
15:23 drrho joined #pdl
17:38 vicash is there a function in PDL like sapply() or apply() in R ? I want to perform a quick lambda style function on certain slices of a large PDL but I can't seem to find an example.. maybe i am not looking in the right document. a document suggestion would be fine too.
17:52 vicash i guess i can use 'map' as it solves my problem. but i was wondering if there is something else as well....
18:01 sivoais vicash: what are you looking to do? Does <http://pdl.perl.org/PDLdocs/Ufunc.html> fit the bill? There's also an alternative interface that might be closer to the the apply() way of doing things: <http://pdl.perl.org/PDLdocs/Reduce.html>.
18:04 vicash i am writing a function for calculating moving average of a 1D pdl. i did it using "map { $a($_ - N : $_ - 1) } N .. $a->nelem" where $a is the pdl .
18:05 vicash sivoais: oops.. "map { $a($_ - $N : $_ - 1)->avg } $N .. $a->nelem" where $a is the pdl and $N is the number of elements over which the average is being done
18:06 vicash this produces a pdl that can be used.
18:29 drrho joined #pdl
18:34 vicash when i compare 2 pdls i use the "==" overloaded operator but I get a resultant piddle with say all ones or some ones or zero. i just want a single flag that says 2 piddles are the same or not. how do i do that ? do i have to use any() ?
18:37 vicash so i use all($a == $b) to get my answer. is this the right way or is there something more elegant ?
18:39 sivoais vicash: yes. How piddles should behave in a boolean context to make them Perlish is still being discussed. You'll want to use any() or all(). Some thing like all ( abs( $x - $y ) < $tol ) would be more robust to FP precision
18:39 vicash thanks. that makes sense
18:51 chm joined #pdl
18:54 chm vicash : Probably the fastest way to calculate a moving average is with conv1d and the kernel is ones(N)/N
18:55 vicash chm: thanks. i will try that. how do i benchmark the speed ?
18:58 chm using range() is more generic but probably more memory intensive.
19:05 chm Here are a couple of snippets from an example session:
19:05 chm $in = (10 * random(10))->floor
19:05 chm $kern = ones(3)/3
19:06 chm p conv1d $in, $kern  # this has wrap-around boundary conditions
19:06 chm [ 7.6666667 5  2.6666667  1.6666667  3.3333333  3.3333333 3  4.3333333  6.6666667  8.3333333]
19:06 chm p $in
19:06 chm [8 7 0 1 4 5 1 3 9 8]
19:08 chm p $in->range($in->xvals->(*1)-1,3,3)->mv(1,0)->average
19:08 chm [ 7.6666667 5  2.6666667  1.6666667  3.3333333  3.3333333 3  4.3333333  6.6666667  8.3333333]
19:08 chm As for timing, there is a standard Benchmark module you can use to measure performance
19:22 sivoais on my machine, the convolution is twice as fast, but I'm not getting just a subsequence of the filter like the map{} does
19:25 vicash sivoais: yes in the moving average of say 5 elements for an array/pdl of size 50 you will have a resultant set of 45 elements
19:26 vicash or 46 :)
19:26 vicash the conv1d gives 50 with the ordering being slightly modified.
19:26 vicash the range() formula doesn't work for me
19:26 vicash i get boundary condition issues
19:29 sivoais vicash: <https://gist.github.com/zmughal/8244764> <-- this is my code
19:30 sivoais that needs to be floor($N/2), heh :-P
19:32 sivoais and the tolerance could be smaller
19:37 * vicash checking the gist
19:40 vicash sivoais: i get invalid slice string
19:41 vicash on line 40
19:42 vicash the central_idx is a PDL of size > 40
19:45 vicash i fixed it by doing $mv->($central_idx(0):$central_idx(-1))
19:46 vicash i see on my macbook air an order of 10 times speed up between the map and convolution
19:49 sivoais ah, I might have been seeing a smaller speedup because I was testing with smaller data
20:05 drrho joined #pdl
20:30 drrho joined #pdl
20:47 sivoais vicash: how much tolerance can you accept? I was playing with different values (nelem = 200 and and N = 20) and I needed to set the $eps = 5e-1
20:49 chm joined #pdl
21:03 vicash sivoais: sorry was away from my desk. the central_idx in your function can be made faster can it not since we need only the first and last index ?
21:03 sivoais yeah, I made that change in the gist when I got back
21:04 sivoais but it doesn't change the benchmark by much for small $N
21:04 vicash sure. to answer your tolerance question i think that should be left to the user
21:04 vicash does PDL keep every floating point as a 64/80 bit double or does it implement it as a float ?
21:05 vicash as in a 4 byte float
21:09 sivoais depends on the type you give the constructor <http://pdl.perl.org/PDLdocs/Core.html#pdl>
21:09 sivoais "The default constructor uses IEEE double-precision floating point numbers."
21:10 vicash ok. well then the user could use 1e-16 or 1e-9 as a good tolerance.. it really depends on the precision of the values they are using
21:11 vicash for example, if you're looking at say prices of the Euro with respect to the US Dollar, it will always be no more than 4 or 5 decimal places. then the tolerance can be 1e-6 even or 1e-9
21:25 vicash chm: the conv1d is definitely faster by an order of magnitude. thanks.
21:32 vicash sivoais: i did a 50000 size pdl with random numbers and did a movavg with both conv1d and map a 1000 times using your script. the conv1d finishes in under 4 seconds but the map is still running :)
21:36 sivoais :-)
21:49 chm joined #pdl
21:51 chm vicash : There is a fair amount of fixed overhead in the creation of a PDL object.  Avoiding new piddle allocation in a loop can speed things up.
21:52 chm Also, slice operations are much faster than index operations.  If $No2 = int($N/2) then $mv->slice( "$No2:-1-$No2" ) is another 2.5X faster
21:53 chm Than the $central_idx = sequence($p->nelem - $N + 1) + floor($N/2); and $mv->slice( $central_idx );
21:55 vicash chm: cool. yes i got rid of the index and used the direct slicing as it is faster.
21:55 sivoais I like that "$No2:-1-$No2" slicing going backwards. I keep forgetting PDL can do that. Too much time spent in MATLAB ;-)
21:56 chm Similarly, if you take the kernel generation out of the loop, it will speed things up as well.
21:56 chm I.e., the $mv_avg_filter = ones($N) / $N;
21:57 sivoais hmm, that could be cached by $N
21:57 sivoais let's try that
22:05 sivoais ok, so that's one place a JIT would do very well (if it were running in a loop like that)
22:15 vicash sivoais: if you use N as an even number then the floor() is not accurate
22:16 sivoais off-by-one?
22:16 vicash yea
22:31 vicash sivoais: $mv( floor(($N - 1)/2) : -1 - ceil(($N - 1)/2))
22:33 sivoais mmmmm, nice. That looks pretty clean
22:34 chm Usually smoothing kernels like these are of odd size so that they don't cause a half-unit shift of the smoothed values from the input values
22:35 sivoais Yeah
22:35 vicash ok. but i just wanted to make sure it was correct just in case.
23:12 jberger_ joined #pdl
23:13 jberger_ sri: was at $work all day
23:13 jberger_ I have a better idea than coderef in a scalar actually ;-)
23:15 jberger_ Cookbook recipe coming
23:15 jberger_ Probably some time tonight
23:15 jberger_ I'm still on the train now
23:15 jberger_ Then we have to stock up for the freeze-in
23:16 sivoais jberger_: I think you're posting in the wrong window :-P
23:16 jberger_ Blast!
23:16 sivoais :-D
23:20 jberger_ sivoais even worse since it takes so long to type in this smart phone
23:21 jberger_ Thanks
23:21 sivoais you can always link to the logs: "I typed this over there *points*" :-)
23:23 jberger_ Hehe
23:58 vicash Can I use PDL::Exporter to get my custom function to work like this : $p->my_func(@args) where $p is a PDL object instead of having to do my_func($p, @args) ?
23:59 vicash i am not able to do that despite using PDL::Exporter. I thought that was the whole point of using it

| Channels | #pdl index | Today | | Search | Google Search | Plain-Text | summary