IRC log for #bioclipse, 2009-10-01

All times shown according to UTC.

Time Nick Message
05:17 egonw joined #bioclipse
06:53 Gpox joined #bioclipse
07:16 egonw_ joined #bioclipse
07:59 egonw @tell jonalv http://chem-bla-ics.blogspot.com/2009/10/processing-chebi-mdl-sd-file-with-cdk.html
07:59 zarah egonw's link is also http://tinyurl.com/yea5vv4
07:59 zarah Consider it noted.
07:59 egonw @tell olas http://chem-bla-ics.blogspot.com/2009/10/processing-chebi-mdl-sd-file-with-cdk.html
07:59 zarah egonw's link is also http://tinyurl.com/yea5vv4
07:59 zarah Consider it noted.
07:59 egonw @tell masak can I do something like '@tell olas, jonalv FOO' ? if not, please consider this a feature request
07:59 zarah Consider it noted.
08:00 egonw @tell masak exact syntax does not matter
08:00 zarah Consider it noted.
08:14 shk3 joined #bioclipse
08:25 egonw_ joined #bioclipse
09:18 olass joined #bioclipse
09:27 * egonw is at home
09:28 egonw unplanned, but was not feeling well yesterday afternoon and evening
09:28 egonw moreover, late at home (22:15 or so)
09:32 * egonw profiling SDF property reading...
09:32 egonw see my blog
09:59 masak joined #bioclipse
11:03 samuell joined #bioclipse
11:11 olass joined #bioclipse
11:22 egonw olass: confirmed that reading the metadata is the bottleneck
11:22 olass what metadata is it?
11:22 egonw 99% of the time is spent on reading the SD fields
11:22 olass why does this take so much time?
11:23 egonw very much metadata
11:23 egonw water
11:23 egonw that has a lot of information
11:23 egonw links to whatever...
11:23 egonw it's the String building really
11:23 egonw just very, very much String building
11:23 olass ok, using StringBuffer I hope?
11:24 egonw internally it always is...
11:24 egonw but, gonna do some tuning now...
11:24 olass I see
11:24 egonw should be possible
11:24 egonw since that has never been done
11:24 olass maybe we should stick to ChEBI_lite.sdf for testing?
11:24 olass without the metadata?
11:26 egonw :)
11:26 egonw yes, was thinking that too :)
11:26 egonw but then say (like Lily Allen is now singing on the radio) to the user... F**k you very, very much ?
11:26 Gpox use StringBuilder if thread safety is not needed
11:26 egonw just use a lite SD file, not the heavy one you are using...
11:26 egonw anyways...
11:27 olass yes, agreed
11:27 olass it does not solve the problem
11:27 egonw made the use of stringbuffer explicit now...
11:27 egonw let's see what boost that gives
11:28 olass Gpox: What is StringBuilder? Better than StringBuffer?
11:29 masak differ in thread safety, methinks.
11:29 olass aha
11:30 masak I never remember which is which, though :)
11:30 Gpox and speed
11:32 olass Gpox, egonw, masak: Don't forget to push what you have for the devel release tomorrow. I will make the 2.2.x branch at that time (not today)
11:32 olass so tomorrow noon I guess
11:33 masak I will be pushing this afternoon.
11:33 olass \o/
11:38 egonw I'm pushing the JCP updates tonite or tomorrow early in the morning...
11:41 egonw masak: hahahaha tvimter ?
11:47 masak TwitVim, apparently.
11:55 egonw using stringbuilder explicitly fixes the problem
11:55 egonw working on patches for 2.0 and 2.2
11:55 egonw the improvement is incredible
11:56 egonw mind blowing
11:56 egonw but I don't see any diff between SBuilder and SBuffer
11:59 egonw but I can live with the theory and will use SBuilder
12:10 samuell joined #bioclipse
12:16 masak egonw: StringBuilder: "A mutable sequence of characters.". StringBuffer: "A thread-safe, mutable sequence of characters." -- http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuffer.html
12:16 zarah masak's link is also http://tinyurl.com/7fve4
12:16 masak so StringBuffer is the threadsafe one. rule of thumb: the StringBuffer gives you a 'buffer' of safety against thread problems.
12:17 egonw sure
12:17 egonw that's what it says... read that too
12:17 egonw the builder however, did not really show to be significantly faster
12:19 masak as long as the variable is local, you can use any which one you want, I guess. it's when it's a field or otherwise shared that it should be a StringBuffer.
12:19 egonw yes, I know
12:19 egonw I know how threading works...
12:19 egonw that was never the point
12:20 masak I assumed you knew. I'm just thinking out loud. :)
12:20 egonw ah, ok
12:30 edrin joined #bioclipse
12:46 egonw Gpox: ping
12:46 egonw olass: ping
12:46 Gpox egonw: pong
12:46 olass egonw: pong
12:47 egonw Gpox, olass: the CDK code was slow at parsing the SD file
12:47 egonw but...
12:47 egonw (stupid me)
12:47 egonw the mol table is *not* using the CDK code
12:47 egonw Gpox: and while you are using a StringBuilder
12:47 egonw not buffering the input when doing things char by char
12:47 egonw makes it horribly slow too
12:48 egonw Gpox: MoleculeTableManager lines 406-423
12:49 egonw but you are using the BufferedIS...
12:50 egonw Gpox: I can try to pinpoint where most time is used...
12:51 * egonw is annoyed he forgot that Bioclipse is using its own SD file parser :(
12:52 egonw Gpox: assuming you have unit tests... does the MoleculeTableManager have unit tests?
12:52 Gpox it should be possible to rewrite it to read lines
12:52 Gpox no it doesn't
12:53 egonw I'll try to make it read lines...
12:53 egonw no, you better do that...
12:53 egonw I stumble on line 3...
12:53 egonw what is start??
12:55 Gpox the start of the properties block in the SDfile
12:56 egonw gonna leave this to you
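Gpox's suggestion of reading lines instead of single characters could look roughly like the sketch below. This is not the MoleculeTableManager code: the class and method names are hypothetical, and the sketch assumes the usual SD record layout (a molfile ending in "M  END", data headers that start with ">" and carry the field name in angle brackets, a blank line closing each data item, and "$$$$" closing the record).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: reads the properties block of ONE SD record line by line.
final class SdPropertySketch {

    static Map<String, String> readProperties(Reader in) throws IOException {
        // BufferedReader fetches large chunks; readLine() avoids per-char reads.
        BufferedReader reader = new BufferedReader(in);
        Map<String, String> props = new LinkedHashMap<>();
        String line;

        // Skip the connection table; the properties block starts after "M  END".
        while ((line = reader.readLine()) != null && !line.startsWith("M  END")) {
            // nothing to do, just advancing
        }

        String field = null;
        StringBuilder value = new StringBuilder();
        while ((line = reader.readLine()) != null && !line.startsWith("$$$$")) {
            if (line.startsWith(">")) {
                // Data header, e.g. "> <NAME>": take the text between < and >.
                int open = line.indexOf('<');
                int close = line.lastIndexOf('>');
                field = (open >= 0 && close > open) ? line.substring(open + 1, close) : null;
                value.setLength(0);
            } else if (line.isEmpty()) {
                // A blank line closes the current data item.
                if (field != null) {
                    props.put(field, value.toString());
                }
                field = null;
            } else if (field != null) {
                // Multi-line values are joined with a newline.
                if (value.length() > 0) {
                    value.append('\n');
                }
                value.append(line);
            }
        }
        if (field != null) {
            props.put(field, value.toString()); // item not closed by a blank line
        }
        return props;
    }

    // Convenience wrapper for in-memory text; wraps the checked exception.
    static Map<String, String> parse(String sdRecord) {
        try {
            return readProperties(new StringReader(sdRecord));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up record, for illustration only.
        String record = "\n  sketch\n\nM  END\n> <NAME>\nwater\n\n$$$$\n";
        System.out.println(parse(record));
    }
}
```

Wrapping whatever stream the editor already has in a BufferedReader keeps the number of underlying reads small, and collecting each value in a single StringBuilder avoids re-copying the text for every line appended.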
13:00 Gpox egonw: but getProperties(...) is only run once in a separate job iirc
13:01 egonw well, it's not the parsing of the connection table that takes long...
13:02 Gpox it does use the CDK SD file parser
13:02 egonw where?
13:03 Gpox SDFIndexEditorModel.getMolecule
13:07 Gpox it should be possible to not pass it the properties section, the information is there to do that
13:07 egonw but the MDLV2000Reader is not reading the data block
13:08 egonw unless...
13:08 egonw ok, found it...
13:08 egonw another patch brewing...
13:09 egonw testing
13:10 egonw btw, no begging for google wave invites here?
13:10 egonw #bioclipse++
13:11 egonw Gpox: OK, problem fixed
13:14 CIA-51 bioclipse.cheminformatics: Egon Willighagen 2.0.x * r27cd147 / (2 files in 2 dirs): StringBuilder instead of += concatenation, boosting performance of SD file support - http://bit.ly/hVFrS
13:15 CIA-51 bioclipse.cheminformatics: Egon Willighagen master * rd075a6e / (2 files in 2 dirs): StringBuilder instead of += concatenation, boosting performance of SD file support - http://bit.ly/yrHVD
13:15 egonw olass, jonalv: please test
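The two commit messages above describe the fix as "StringBuilder instead of += concatenation". A minimal Java sketch of that kind of change follows; the class and method names are hypothetical and not taken from the CDK or Bioclipse sources. It may also explain why egonw measured no difference between SBuilder and SBuffer: both append into one growable buffer, whereas += in a loop copies the whole accumulated string on every pass.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the two ways to accumulate SD field text.
final class ConcatSketch {

    // Slow for large inputs: every += allocates a new String and copies
    // everything accumulated so far, so total work grows quadratically.
    static String joinWithConcat(List<String> lines) {
        String result = "";
        for (String line : lines) {
            result += line + "\n";
        }
        return result;
    }

    // Appends into one growable buffer, so total work grows linearly.
    // StringBuffer would behave the same here, with synchronized methods.
    static String joinWithBuilder(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up field content, for illustration only.
        List<String> lines = Arrays.asList("> <NAME>", "water", "");
        // Both methods build the same text; only the cost differs.
        System.out.println(joinWithConcat(lines).equals(joinWithBuilder(lines)));
    }
}
```

As masak notes earlier in the log, a builder held in a local variable is never shared between threads, so StringBuilder is sufficient in a method like this.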
13:28 masak joined #bioclipse
13:28 mgerlich joined #bioclipse
13:28 stain joined #bioclipse
13:38 edrin egonw: did you know this: http://web.chemdoodle.com/overview.php ?
13:38 zarah edrin's link is also http://tinyurl.com/ybmxp9s
13:50 egonw edrin: yes
13:51 egonw samuell: reading material for you: http://www.biomedcentral.com/1471-2105/10?issue=S10
13:51 zarah egonw's link is also http://tinyurl.com/y8pma8m
13:52 samuell egonw: Thanks!
14:05 egonw samuell: btw, you might find this one interesting too: http://esw.w3.org/topic/HCLSIG
14:05 zarah egonw's link is also http://tinyurl.com/y8ugdvt
14:06 samuell egonw: Yep, added to bookmarks. thx.
14:09 CIA-51 bioclipse.cheminformatics: Egon Willighagen 2.0.x * r30bbc30 / plugins/net.bioclipse.cdk.ui/src/net/bioclipse/cdk/ui/wizards/NewFromSMILESWizard.java : Added note that it saves in CML format (clarifies #1626) - http://bit.ly/3eMEcP
14:32 masak vim++
15:22 * egonw is updating bc2.2 with the latest CDK+JCP-Prim
15:23 egonw but running into a lot of trouble never seen with eclipse 3.4
15:25 olass egonw: your fix to SDF reader was a real performance boost!
15:25 olass egonw++
15:25 egonw and a few repeated refreshes makes all problems disappear like a summer in sweden
15:25 egonw olass: yes, it apparently was a big bottleneck
15:26 * egonw is happy that OS/X is such a crappy OS, that the bottleneck showed up :)
15:26 egonw hahahaha
15:26 olass hrmf
15:26 egonw seriously...
15:27 egonw for 2.4 we should plan a YourKit-on-Plugin-Unit-Tests session
15:27 olass yup
15:27 olass sounds like a plan
15:31 egonw olass: ping
15:31 egonw Gpox is not around...
15:31 olass egonw: I'd like you to close bugs reported as resolved fixed, for example 73, 794, 795, 1064, etc
15:31 olass egonw: pong
15:32 olass Gpox no
15:32 egonw there are a few compile errors resulting from the update I am about to do...
15:32 olass he left 15.30:ish
15:32 egonw but Gpox will fix those
15:32 egonw yes, I know
15:32 olass ok
15:32 egonw :)
15:32 olass so cdk will not compile from now until he fixes those?
15:32 egonw so, I will update, but that will leave the repos slightly broken...
15:32 olass fine with me
15:32 egonw no, just the JCP part
15:32 olass thx for the pointer
15:33 olass ok
15:33 egonw but he needs me to make this commit to proceed
15:33 egonw I'm sure he and I will resolve it tomorrow
15:33 olass yup, I know
15:33 olass [17:31] < olass> egonw: I'd like you to close bugs reported as resolved fixed, for example 73, 794, 795, 1064, etc
15:33 egonw I have a very long list...
15:33 olass egonw: they don't show up in your queries?
15:33 egonw I'll see if I can find some time tomorrow to do some admin stuff
15:33 olass would be appreciated, they litter my lists
15:34 egonw I skipped Hierta, btw
15:34 olass me too :(
15:34 olass no time
15:34 egonw could not find the energy to push me once more...
15:34 olass that's life
15:35 * samuell has to leave for a couple of hours. bbl
15:35 olass bye samuell
15:35 samuell bye
15:35 egonw bye
15:35 samuell left #bioclipse
15:39 CIA-51 bioclipse.cheminformatics: Egon Willighagen master * r5f0923d / (168 files in 99 dirs): Pushed in a new CDK 1.3.0.+ version plus updated JChemPaint-Primary - http://bit.ly/GMoL4
15:40 CIA-51 bioclipse.rdf: Egon Willighagen master * r5035b4c / plugins/net.bioclipse.rdf/src/net/bioclipse/rdf/business/IRDFManager.java : Added API for downloading RDFa - http://bit.ly/ciAxu
15:40 CIA-51 bioclipse.rdf: Egon Willighagen master * rba4ee45 / plugins/net.bioclipse.rdf/src/net/bioclipse/rdf/business/RDFManager.java : Implemented a cheap importRDFa method, using the W3C webservice - http://bit.ly/11haz6
15:52 mgerlich joined #bioclipse
16:29 stain joined #bioclipse
16:41 edrin left #bioclipse
19:57 samuell joined #bioclipse
20:29 olass joined #bioclipse
