IRC log for #pdl, 2013-05-04

All times shown according to UTC.

Time Nick Message
09:20 ilbot2 joined #pdl
09:20 Topic for #pdl is now Install PDL: http://pdl.perl.org/?page=install  | Book: http://pdl.perl.org/content/pdl-book-toc.html | Mailing list: http://pdl.perl.org/?page=mailing-lists | Pasting: http://scsys.co.uk:8001/pdl | Channel is logged by ilbot2: http://irclog.perlgeek.de/pdl/today
12:28 chm joined #pdl
13:29 dimuls joined #pdl
13:29 someanon joined #pdl
13:37 someanon hi, guys! I need help. I want to create a very huge and sparse matrix (22154 cols, 10226 rows). I don't have enough memory. It dies with "Out of the memory". I tried PDL::Sparse, but it can't be written to a file using PDL::IO::FastRaw::writefraw. I tried PDL::IO::FastRaw::mapfraw, but it also dies with "Out of the memory". Any ideas?
14:25 chm joined #pdl
15:20 chm left #pdl
15:32 someanon ping
15:35 pdurbin someanon: the experts are idling, apparently :)
15:51 run4flat_ hola
15:51 run4flat_ someanon, I have an idea
15:51 run4flat_ :-)
15:52 run4flat_ The problem with using mapfraw is that it will probably "physicalize" the piddle before storing the data
15:52 run4flat_ i think
15:52 run4flat_ so then creating the file in the first place is tricky
15:52 run4flat_ but, if you created your huge matrix by hand on the hard drive, by writing out the binary
15:53 run4flat_ then i would have expected mapfraw to work
15:53 run4flat_ I bet you probably didn't build the binary data file by hand, though. :-)
15:54 run4flat_ however, it sounds like you just need a way to store PDL::Sparse to file
15:54 run4flat_ so maybe fastraw and/or mapfraw isn't the right answer
15:54 run4flat_ It might be good for somebody (ha!) to write some sort of file storage format for PDL::Sparse
15:56 run4flat_ i can help step you through that, but I gotta go right now
15:56 * run4flat_ morphs back into joel
15:56 jberger Bender1, share the love
15:56 pdurbin huh? :) nice morph :)
15:56 jberger Bender1, spread the love
15:57 jberger pdurbin, tricky eh?
15:59 pdurbin jberger: so you gotta go or not?
15:59 pdurbin 'cause I have a math question
16:00 pdurbin well, I guess all I'm wondering is if concepts such as vector space model are on topic for this channel. please see http://irclog.greptilian.com/sourcefu/2013-05-04#i_5752 for more on this
16:06 jberger indeed, both run4flat and I have to go, but we will be happy to look at it I'm sure
16:17 pdurbin jberger: cool. I'm in and out myself. I'm just wondering if you guys know this stuff cold. cool that it's on topic
16:26 pdurbin run4flat: FYI, I opened this issue for you: package perl-PDL-Graphics-Prima · Issue #267 · repoforge/rpms - https://github.com/repoforge/rpms/issues/267
16:50 someanon nice:3
16:53 someanon i will try to make the stored file by hand first
16:53 someanon it's just 800mb
16:54 someanon and maybe then try to write PDL::Sparse dumper
16:55 someanon run4flat: thanks
17:22 someanon joined #pdl
17:46 someanon dd if=/dev/zero of=document-term-model count=3539793
17:46 someanon this creates the needed dump file
17:47 someanon 1.7 gb
17:47 someanon 6 2 22154 10226
17:47 someanon this is the .hdr file
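A rough check of the sizes quoted above, as a sketch under two assumptions that the log does not state: each element is an 8-byte double (which the leading 6 in the .hdr line appears intended to select), and dd uses its default 512-byte block size because no bs= was given. The variable names are illustrative only.

```shell
#!/bin/sh
# Assumptions (not confirmed in the log): 8-byte double elements,
# and dd's default block size of 512 bytes.
cols=22154
rows=10226
elem_bytes=8
block=512

bytes=$((cols * rows * elem_bytes))         # size the raw data file needs
blocks=$(( (bytes + block - 1) / block ))   # dd count, rounded up to whole blocks
written=$((3539793 * block))                # bytes produced by count=3539793

echo "bytes needed:      $bytes"
echo "dd count needed:   $blocks"
echo "bytes in the log:  $written ($((bytes - written)) short)"
```

Under these assumptions the count used in the log rounds down rather than up, so the file comes out slightly smaller than the header describes. Whether that shortfall matters to mapfraw is not established anywhere in the log.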
17:52 someanon still out of memory
17:53 someanon bad):
17:58 chm joined #pdl
18:04 chm someanon, I think the problem is you are running out of memory on your system, not that PDL cannot handle a file that big.  The current PDL indexing should work for the number of elements you have.
18:04 chm What is your perl info and PDL info (paste output of perldl -V)?  How much memory do you have on your system?..
18:07 chm Are you running in a PDL shell or is this in a program/script?
18:09 chm If you start small, how big can you get?  E.g., what is the largest piddle you can create in 1D?  $pdl = sequence($n) for $n=1,2,4,...65536...
18:23 someanon joined #pdl
18:27 someanon chm: perl -V http://paste.ubuntu.com/5633128/ and perldl -V http://paste.ubuntu.com/5633129/
18:29 someanon chm: i have 4 gb memory and free 2gb
18:30 someanon i run perl script
18:32 chm Do you set $PDL::BIGPDL?
18:33 someanon chm: like that: $PDL::BIGPDL = 1;
18:33 someanon ?
18:33 chm yes
18:33 someanon yes
18:33 someanon it warns me
18:33 someanon chm: Name "PDL::BIGPDL" used only once: possible typo at ...
18:34 chm It should be $PDL::BIGPDL
18:34 someanon it is
18:35 someanon $PDL::BIGPDL = 1;
18:35 chm ok, you typed 'PDL::BIGPDL' which is not a scalar
18:35 chm Do you have a short code segment that shows the problem?
18:35 chm Maybe something else is going on....
18:36 chm I also assume you have a *large* amount of swap space.
18:36 chm It could be intermediate temp values blowing the memory limit
18:36 someanon http://paste.ubuntu.com/5633153/
18:37 someanon i have not swap^3
18:37 someanon my ssd scares it
18:38 someanon chm: :3
18:43 chm The code you pasted doesn't run.  Do you have something stand-alone?
18:45 chm Also, if you don't have swap, then you'll easily run out of memory working with a 1.7GB piddle.  The code you posted looks like it has lots of other objects/data in it.
18:46 chm You could try running under the perl debugger, stepping through a line at a time, then using the top command to monitor memory usage.
18:46 chm That might give you an idea of where the problem is coming in.
18:47 someanon a move lines with "my $DT = ..." to the top and add after that "exit 0";
18:48 someanon so it runs out in mapfraw
18:49 someanon chm: other data is about 200mb
18:50 someanon *fix: i move lines
18:50 someanon moved
18:52 someanon chm: perl -e ' use PDL; $PDL::BIGPDL = 1; my $d = zeros(16700, 10000);' worked
18:53 someanon chm: perl -e ' use PDL; $PDL::BIGPDL = 1; my $d = zeros(16800, 10000);' not
18:53 someanon chm: so, i don't have enough memory?
18:54 chm I think that is the problem.  If you add swap, can you go larger?  I'm afraid I have to go for now.  o/
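For scale, a sketch of the memory each allocation in this exchange would need, assuming zeros() creates 8-byte doubles by default (an assumption, not something shown in the log). It compares the size that worked, the size that failed, and the full 22154 x 10226 matrix.

```shell
#!/bin/sh
# Assumption: zeros() allocates 8-byte doubles by default.
elem_bytes=8
mib=$((1024 * 1024))

ok_bytes=$((16700 * 10000 * elem_bytes))     # allocation that worked
fail_bytes=$((16800 * 10000 * elem_bytes))   # allocation that failed
full_bytes=$((22154 * 10226 * elem_bytes))   # the full matrix

# Integer division, so the MiB figures are rounded down.
echo "worked: $((ok_bytes / mib)) MiB"
echo "failed: $((fail_bytes / mib)) MiB"
echo "full:   $((full_bytes / mib)) MiB"
```

If the assumption holds, the failure point sits well below the size of the full matrix, which fits the suggestion that available memory, rather than PDL indexing, is the limit.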
19:20 someanon joined #pdl
19:21 someanon with 2gb swap it is done
19:33 someanon joined #pdl
19:35 someanon run4flat: ping
20:35 someanon run4flat: looks like PDL::Sparse already has the needed functions http://cpansearch.perl.org/src/KWILLIAMS/PDL-Sparse-0.01/Sparse.pd
20:35 someanon read_from_dir and write_to_dir
20:37 someanon but they are absent from the docs
20:52 someanon it's strange..
21:21 someanon joined #pdl
21:59 jberger pdurbin, that seems like it could be on topic, but I don't know anything about it
22:34 someanon run4flat: the functions are incorrect. I corrected them. Now they use YAML::LoadFile and YAML::DumpFile to dump and load the indexes and values
22:34 someanon seems to work perfectly
23:45 someanon hm, now i need simple slices on PDL::Sparse.
23:46 pdurbin jberger: ok. I feel a little better :)