
IRC log for #cdk, 2011-04-18


All times shown according to UTC.

Time Nick Message
03:51 egonw joined #cdk
04:41 egonw left #cdk
04:46 egonw joined #cdk
04:51 egonw left #cdk
05:06 alchimis1e left #cdk
05:10 alchimiste joined #cdk
05:13 egonw joined #cdk
05:20 egonw_ joined #cdk
05:24 egonw left #cdk
05:31 egonw_ moin
05:31 zarah ni hao egonw_
05:47 egonw_ left #cdk
05:59 egonw_ joined #cdk
06:01 egonw__ joined #cdk
06:05 egonw_ left #cdk
06:10 egonw joined #cdk
06:14 egonw__ left #cdk
06:23 egonw left #cdk
06:28 jbrefort joined #cdk
06:31 egonw joined #cdk
06:35 egonw left #cdk
06:41 egonw joined #cdk
06:50 egonw left #cdk
06:51 egonw joined #cdk
06:53 egonw left #cdk
06:53 egonw joined #cdk
07:03 Gpox joined #cdk
07:16 mgerlich joined #cdk
07:18 sneumann joined #cdk
07:31 egonw left #cdk
07:36 s_wolf joined #cdk
09:04 egonw joined #cdk
09:04 egonw left #cdk
09:05 egonw joined #cdk
13:12 maclean joined #cdk
13:23 egonw hi maclean
13:23 maclean hej egonw
13:23 egonw you had a question?
13:24 maclean Ah, sorry yes. About whether OPSIN could extract the R/S from an IUPAC name.
13:25 maclean After some digging in the code, it seems not.
13:26 egonw oh, but it does
13:26 egonw this is the route I want to take too, in fact
13:26 maclean Well, you can get the molecule as CML
13:26 egonw and I have discussed this approach with Daniel
13:26 maclean butttttt...
13:26 egonw yes, and that will have the chirality
13:27 maclean The chirality or the atomParity?
13:27 egonw atomParity...
13:27 egonw but it just happens to match R,S
13:27 maclean Oh?
13:27 egonw that is implementation detail I discussed with Daniel :)
13:27 maclean Ah.
13:27 egonw now, I don't remember if + == R or the other way around
13:27 maclean Hehe.
13:27 maclean There is a StereoHandler class, that seems to implement CIP-like rules.
13:28 egonw yes, he has a CIP implementation too
13:28 egonw I hope to get on to this next week
13:29 maclean Ok, well you're welcome to use the WedgeStereo stuff then, if you like. Might be useful, I suppose.
13:30 maclean I currently have a comparison of ChEBI R/S from a proprietary toolkit vs CDK. 500 examples.
13:30 maclean Sorry, 5,000.
13:30 egonw and?
13:31 maclean 12% of centers are mis-assigned, and 40% of molecules have at least one mis-assignment.
13:31 egonw ouch
13:31 egonw that hurts
13:31 maclean One has 13 (ChEBI:26828)
13:32 maclean Well, some of that may be R-groups.
13:33 maclean Also, some are better than others. Only 4 mis-assigned in this monster : http://www.ebi.ac.uk/chebi/searchId.do?chebiId=59581
13:34 maclean But 6/11 in this : http://www.ebi.ac.uk/chebi/advancedSearchFT.do?searchString=44230
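The two figures quoted above (share of mis-assigned centers, and share of molecules with at least one mis-assignment) can be computed along these lines. This is only a sketch: the function name `mismatch_rates`, the dictionary layout, and the sample data are hypothetical, and centers the candidate toolkit leaves unassigned are skipped rather than counted as wrong, which is an assumption, not necessarily how the comparison in the log was done.

```python
def mismatch_rates(reference, candidate):
    """Compare R/S labels per stereocenter.

    Both arguments map a molecule id to a dict of atom index -> 'R' or 'S'.
    Returns (fraction of compared centers that differ,
             fraction of molecules with at least one differing center).
    """
    total = 0            # centers assigned by both toolkits
    wrong = 0            # centers where the labels differ
    molecules_hit = 0    # molecules with at least one differing center
    for mol_id, ref_labels in reference.items():
        cand_labels = candidate.get(mol_id, {})
        wrong_here = 0
        for atom, ref in ref_labels.items():
            cand = cand_labels.get(atom)
            if cand is None:
                continue  # assumption: unassigned centers are not counted
            total += 1
            if cand != ref:
                wrong_here += 1
        wrong += wrong_here
        if wrong_here:
            molecules_hit += 1
    center_rate = wrong / total if total else 0.0
    molecule_rate = molecules_hit / len(reference) if reference else 0.0
    return center_rate, molecule_rate


# Hypothetical toy data, not taken from ChEBI.
reference = {"mol-a": {1: "R", 4: "S"}, "mol-b": {2: "R"}, "mol-c": {3: "S", 5: "R"}}
candidate = {"mol-a": {1: "S", 4: "S"}, "mol-b": {2: "R"}, "mol-c": {3: "S"}}
center_rate, molecule_rate = mismatch_rates(reference, candidate)
```

Keeping unassigned centers out of the error count separates "no assignment" from "wrong assignment", which the discussion later treats as different problems.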
13:35 egonw I have a suspicion on why it may fail
13:36 egonw I think this is R,S resolution based on stereochemistry of side chains kicking in...
13:36 egonw what are you using as input?
13:37 maclean Input? The ChEBI sdf file.
13:38 egonw 3D coords, I guess
13:38 maclean A mix of the two.
13:38 egonw have you tried starting with the chiral SMILES?
13:39 maclean No... what difference would you expect?
13:39 maclean I mean, smiles has @/@@, but.
13:40 egonw well, I guess I am just hoping that score will do better :)
13:41 maclean Ah, right.
13:41 egonw but I got no constructive argument for that
13:41 maclean Well, it might eliminate one possible source of error - the assignment of CW/ACW based on wedges.
13:42 egonw wedges?
13:42 egonw using your tool
13:42 maclean Stereo wedges. Yes
13:43 egonw eliminating that step, we'd kind of bisect the problem
13:43 maclean true.
13:44 maclean But the ligand ordering based on ligand-stereocenters problem you mentioned is another thing.
13:44 egonw yes...
13:44 egonw that's by design...
13:45 egonw that would be a very major rewrite of the code
13:45 egonw I was hoping that those cases would be rare...
13:45 egonw apparently not... :/
13:45 maclean I don't know how common they are actually. Certainly cases where CDK makes no assignment are clear for this.
13:46 egonw indeed
13:46 maclean Eg : http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A31170
13:47 maclean where the two bridgehead atoms are misassigned, but the C(CCO) atom is not assigned at all.
13:48 maclean Anyway, there are lots of examples! :) For next week, as you say. Maybe I'll have fixed the code by then...
13:48 egonw the bridgehead atoms are misassigned... mmm...
13:48 egonw what are the expected R,S values? different, right?
13:48 maclean Opposite.
13:49 maclean R->S, S->R.
13:49 maclean Do I mean "bridgehead" here? The ones at the thin ends of the up wedges.
13:49 egonw yes, those are bridgeheads
13:50 egonw OK, the top carbon should be R, right?
13:50 egonw damn CIP rules...
13:50 egonw one sec...
13:51 egonw yes, should be R
13:51 egonw bottom one should be S
13:51 maclean Correct!
13:52 s_wolf left #cdk
13:52 egonw but my CIP code says otherwise?
13:53 maclean Yes, it says the opposite. But I'll need to bisect-check, as we talked about.
13:54 egonw funny thing is... I don't even think the SMILES is more useful here...
13:54 egonw the SMILES is only a graph... and would not be able to say which of the two belongs where...
13:55 egonw because it could easily make the geometry fucked up, with enormously long bonds
13:55 maclean Hmm. Yes, without 2D coords, you can't get the CW/ACW to project the CIP ordering onto.
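As a rough illustration of projecting a CIP ordering onto 2D coordinates: given the 2D positions of the three highest-priority ligands, in priority order, the sign of the triangle's signed area gives the rotation sense. This is a minimal sketch, not the CDK or WedgeStereo code. The function names are hypothetical, the y axis is assumed to point up (screen coordinates with y down flip the sign), and the lowest-priority ligand is assumed to be the one on the wedge.

```python
def signed_area(a, b, c):
    """Signed area of triangle a-b-c; positive means anticlockwise (y axis up)."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def descriptor(ligands_by_priority, lowest_points_away):
    """Return 'R', 'S', or None for three 2D ligand positions.

    ligands_by_priority: positions of the ligands ranked 1, 2, 3 by CIP priority.
    lowest_points_away: True if the lowest-priority ligand points away from
    the viewer (hashed wedge), False if it points toward the viewer.
    """
    a, b, c = ligands_by_priority
    area = signed_area(a, b, c)
    if area == 0:
        return None  # collinear ligands: rotation sense is undefined
    clockwise = area < 0
    if not lowest_points_away:
        clockwise = not clockwise  # viewing from the opposite side reverses the sense
    return "R" if clockwise else "S"


# Hypothetical coordinates: top, bottom right, bottom left (a clockwise walk).
ligands = [(0.0, 1.0), (1.0, -1.0), (-1.0, -1.0)]
away = descriptor(ligands, lowest_points_away=True)
toward = descriptor(ligands, lowest_points_away=False)
```

Testing the priority ranking and the CW/ACW step separately is one way to do the bisecting mentioned earlier in the conversation.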
13:56 s_wolf joined #cdk
14:01 egonw I think first we need good unit tests now... for common problems... this last example is a nice one, simple, one that one can manually verify
14:03 egonw I'm quite interested in your InChIReader...
14:03 egonw we had one before, but that one never got updated for newer InChI versions...
14:03 egonw did I understand correctly that it also parses the AuxInfo stuff?
14:04 egonw i.e. the original labeling?
14:04 maclean It tries to make a molecule graph that has the same labelling, yes.
14:05 maclean This is essential for getting the (+)/(-) descriptor refs. Sadly, those are unrelated to R/S.
14:05 egonw indeed
14:05 egonw but is the AuxInfo used?
14:05 egonw or do I need to do a graph matching at the end?
14:05 egonw use case:
14:06 egonw for a certain IAtomContainer, get me the InChI atom numbering
14:07 maclean No, it should be possible to get the InChI atom numbering (labelling) without an isomorphism.
14:08 egonw excellent...
14:08 maclean That is, return a permutation p where p[i] = j for atoms i in the original mol, and j in the inchi.
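The permutation described here can be applied, or inverted, without any graph matching. The sketch below shows only that bookkeeping; it does not read an InChI or AuxInfo string, and the helper names and sample permutation are hypothetical.

```python
def relabel(values, p):
    """Reorder per-atom values from original order into InChI order.

    p[i] = j means atom i of the original molecule is atom j in the InChI
    numbering (0-based here for simplicity).
    """
    out = [None] * len(p)
    for i, j in enumerate(p):
        out[j] = values[i]
    return out


def invert_permutation(p):
    """Return q with q[j] = i, mapping InChI numbering back to the original."""
    q = [None] * len(p)
    for i, j in enumerate(p):
        q[j] = i
    return q


# Hypothetical example: three atoms and an assumed permutation.
p = [2, 0, 1]
symbols = ["C", "O", "N"]
inchi_order = relabel(symbols, p)
back = invert_permutation(p)
```

With this example, `inchi_order` is `["O", "N", "C"]` and `back` is `[1, 2, 0]`.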
14:08 egonw I need to look at that...
14:08 egonw oh, crap...
14:08 egonw we really, really must have a CDK workshop soon
14:08 egonw a developers workshop that is
14:09 maclean Right, yes.
14:11 egonw I'll convert that class into a patch then :)
14:13 maclean Cool, thanks. I'll try to add parsing of other layers at some point.
14:14 egonw right
14:14 egonw it has to start somewhere
14:14 maclean Well it started with Mark's code :) He should be on the author list. One of the methods is his.
14:15 egonw OK, please comment that in an email
14:15 egonw in reply to the .java file
14:15 egonw that should be a line in the copyright header
14:15 maclean Ok.
14:22 maclean got to go. see you later.
14:23 maclean left #cdk
14:27 egonw left #cdk
14:50 egonw joined #cdk
15:11 egonw_ joined #cdk
15:12 egonw__ joined #cdk
15:14 egonw left #cdk
15:15 egonw_ left #cdk
15:19 egonw joined #cdk
15:20 egonw__ left #cdk
15:21 CIA-121 cdk: andreas1981 * r15596 /cdk-taverna-2-paper/bmc_article.tex: - Changed slightly some parts.
15:29 egonw_ joined #cdk
15:33 egonw left #cdk
15:34 egonw__ joined #cdk
15:34 egonw_ left #cdk
15:45 egonw__ is now known as egonw
15:53 egonw left #cdk
15:57 Gpoks joined #cdk
16:00 Gpox left #cdk
16:01 Gpoks is now known as Gpox
17:03 jbrefort left #cdk
17:03 egonw joined #cdk
19:46 jbrefort joined #cdk
20:47 egonw left #cdk
21:07 jbrefort left #cdk
23:55 Gpox left #cdk
