
IRC log for #cdk, 2011-07-23


All times shown according to UTC.

Time Nick Message
00:48 egonw left #cdk
06:00 jbrefort joined #cdk
07:54 egonw joined #cdk
08:46 sneumann__ joined #cdk
09:20 sneumann__ left #cdk
10:09 CIA-90 cdk: maclean 311-14x-gilleainATs * r105850e / (3 files in 3 dirs):
10:09 CIA-90 cdk: Mn final
10:09 CIA-90 cdk: Signed-off-by: Egon Willighagen <egonw@users.sourceforge.net> - https://github.com/cdk/cdk/commit/105850ec67ea099534b1ae3bf27875a401ede7de
10:09 CIA-90 cdk: Egon Willighagen 311-14x-gilleainATs * r229f036 / src/test/org/openscience/cdk/atomtype/CDKAtomTypeMatcherTest.java :
10:09 CIA-90 cdk: Moved countTestedAtomTypes() to the end, to fix the checking if all atom types
10:59 maclean joined #cdk
10:59 egonw hi maclean
10:59 maclean hi egonw
11:00 egonw I'm working my way through patches this morning
11:00 maclean cool, thank you.
11:00 maclean I saw the Mn commit.
11:00 egonw yeah, premature and wrong branch :)
11:00 maclean oh. heh
11:00 egonw I am 'rewriting' your patches a bit...
11:00 egonw they don't apply cleanly...
11:00 maclean :(
11:00 egonw not that I make code changes
11:01 egonw but one issue was that countAtomTypes()
11:01 egonw needs to be the last method
11:01 maclean ??
11:01 maclean method order is relevant?
11:01 egonw and you nicely placed all your tests at the end of the class :)
11:01 egonw yeah, unfortunately :(
11:01 egonw I haven't figured out a solution for that
11:02 egonw anyways... it keeps mixing up others of your patches with the latest one I am applying
11:02 egonw so I keep copy/pasting code :)
11:02 maclean what mechanism goes wrong if it's not the last method?
11:02 egonw but that's OK, as I combine this with looking at the code :)
11:02 egonw the check if all atom types defined in the .xml are tested by unit tests...
11:02 maclean well, sure. but I was trying to avoid you having to copy/paste...
11:03 egonw JUnit has @AfterClass, but that is a static method
11:03 egonw while the list of tested atom types is not a static field
11:03 maclean There's no @After?
11:03 egonw yeah, but that gets run after each test
11:03 maclean Oh, right, yes.
11:03 maclean Hmm.
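
The constraint discussed above (a static @AfterClass hook versus a non-static list of tested atom types) can be illustrated with a minimal plain-Java sketch. It assumes the bookkeeping is moved into a static collection, which is only one possible way around the method-order dependency and not necessarily what CDK does; the class name, the markTested and untested helpers, and the atom type names are all hypothetical, and JUnit itself is left out so the block runs on its own.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class AtomTypeCoverageSketch {

    // Static, so the record is reachable from a static hook such as
    // JUnit 4's @AfterClass and does not depend on test method order.
    private static final Set<String> TESTED = new LinkedHashSet<>();

    // Hypothetical helper: each unit test would call this for the
    // atom type it asserts on.
    public static void markTested(String atomTypeName) {
        TESTED.add(atomTypeName);
    }

    // Returns the defined atom types that no test has recorded, sorted by name.
    // In a JUnit 4 test class this logic could live in a static method
    // annotated with @AfterClass (assumption, not shown here).
    public static Set<String> untested(List<String> definedTypes) {
        Set<String> missing = new TreeSet<>(definedTypes);
        missing.removeAll(TESTED);
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder names standing in for entries read from the atom type .xml.
        List<String> defined = Arrays.asList("C.sp3", "C.sp2", "I", "Mn.2plus");
        markTested("C.sp3");
        markTested("I");
        System.out.println("Untested: " + untested(defined));
    }
}
```

With the record held statically, the final check no longer has to be the last test method in the class; the trade-off is shared mutable state between tests.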
11:04 egonw OK, let me commit what I have right now
11:04 maclean ok
11:04 CIA-90 cdk: Egon Willighagen cdk-1.4.x * r3d4f7a5 / src/main/org/openscience/cdk/atomtype/CDKAtomTypeMatcher.java : Hooked in iodine atom type detection (+8 more commits...) - https://github.com/cdk/cdk/commit/3d4f7a570d6ea0ed01e7d1cbf8ca37ecc787e69b
11:05 egonw I'll also send the matching email :)
11:05 maclean ok, great.
11:05 egonw sent
11:05 egonw 5 ATs are in now
11:05 egonw 41 to go :)
11:06 maclean by the way, do you think it makes sense to have a slightly finer-grain class structure for the CDKAtomTypeMatcher?
11:06 egonw I'm very open to ideas for that
11:06 maclean Like, MetalMatcher, er...TransuraniumMatcher...
11:06 maclean NormalElementMatcher (CNOPS...)
11:06 egonw e.g. I am also wondering how we can tune performance for this thing
11:06 egonw improve...
11:07 egonw the class must be super fast
11:07 maclean True.
11:07 maclean and it is very hard to add stuff, as these giant if/else towers are hard to read.
11:08 maclean But then, a more 'flexible' solution may be slower :(
11:09 egonw Martin Ott (now at Lhasa) has a cool system...
11:09 egonw set based
11:09 egonw he defined sets for all sorts of things: sp2 atoms
11:10 egonw a set for atoms with 1 double bond (etc)
11:10 maclean Right.
11:10 egonw and then atom types match a particular subset of sets
11:10 maclean Ok, I see.
11:10 egonw sorry... more accurate
11:10 egonw an atom type is a combination of certain set definitions
11:10 maclean Yes.
11:11 egonw atoms in such a subset are of that atom type
11:11 maclean that is much closer to the more 'compsci' approach that I suspect would be better.
11:11 egonw but then too, you have to figure out what the most discriminative sets are
11:12 maclean Yes, but essentially this is just a formalization of what we are doing anyway.
11:12 egonw true
11:12 maclean (I think)
11:12 egonw yeah, I think so too...
11:12 maclean So, exposing possible inconsistencies seems worthwhile.
11:12 maclean But tricky :)
11:13 egonw yes, one big difference is that we have multiple routes now to the same atom type...
11:13 egonw but I guess that applies to that approach too
11:13 egonw the problem is to address missing information
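
The set-based idea egonw describes above (property sets such as "sp2 atoms" or "atoms with 1 double bond", with an atom type defined as a combination of them) can be sketched roughly as a set intersection. This is only an illustration of the idea as stated in the log, not Martin Ott's system; the toy molecule, the property sets, and the type names are invented for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SetBasedTypingSketch {

    // Atoms of a type = intersection of all property sets that define the type.
    public static Set<Integer> intersect(List<Set<Integer>> propertySets) {
        if (propertySets.isEmpty()) {
            return new TreeSet<>();
        }
        // Copy the first set so the inputs are left unmodified.
        Set<Integer> result = new TreeSet<>(propertySets.get(0));
        for (Set<Integer> s : propertySets.subList(1, propertySets.size())) {
            result.retainAll(s);
        }
        return result;
    }

    private static Set<Integer> setOf(Integer... atoms) {
        return new TreeSet<>(Arrays.asList(atoms));
    }

    public static void main(String[] args) {
        // Toy molecule (acetaldehyde heavy atoms):
        // 0 = methyl carbon, 1 = carbonyl carbon, 2 = oxygen.
        Set<Integer> carbon = setOf(0, 1);
        Set<Integer> oxygen = setOf(2);
        Set<Integer> oneDoubleBond = setOf(1, 2);

        // Hypothetical type definitions: each type lists the sets it combines.
        Map<String, List<Set<Integer>>> types = new LinkedHashMap<>();
        types.put("carbonyl-like C", Arrays.asList(carbon, oneDoubleBond));
        types.put("carbonyl-like O", Arrays.asList(oxygen, oneDoubleBond));

        for (Map.Entry<String, List<Set<Integer>>> e : types.entrySet()) {
            System.out.println(e.getKey() + " -> atoms " + intersect(e.getValue()));
        }
    }
}
```

An atom that lands in no intersection has no type under these definitions, which is one place where the missing-information problem mentioned above would surface.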
11:13 maclean http://chem-bla-ics.blogspot.com/2007/07/atom-typing-in-cdk.html I guess.
11:14 maclean That guy mentions Ott's Fortran code. Hmmm.
11:15 egonw dunno if I still have that
11:16 egonw but we do have Mol2 (aka Sybyl) atom type perception now
11:16 egonw well... now -> for a long time now
11:16 maclean Indeed.
11:16 maclean About calcium.
11:17 maclean What do you mean by the history is different?
11:17 egonw nothing special...
11:17 egonw it's just that the patch no longer applies
11:17 egonw and the merging was too difficult for me to work out just now
11:17 maclean Oh. Weird though. Essentially all the patches were made in the same way - I would think that they have the same history.
11:17 egonw so, history -> latest version of what I have
11:18 egonw yeah, it's the other AT patches and my move of countAtomTypes() to the end that screw things up
11:18 maclean Oh IC.
11:19 maclean I'll ask nimish about chlorine, but it's probably a KEGG error.
11:19 egonw don't think KEGG has hybrid info
11:19 egonw oh, btw, someone on G+ wondered why that Na- KEGG entry was wrong :)
11:20 maclean Oh, ok. But I mean that the structure is probably like that Na- one and we've tried to fit chemistry around it...
11:20 egonw yeah, could be
11:20 egonw let's check molbase
11:20 maclean Probably just someone drawing it wrong, there are many drawing errors in the structures.
11:21 egonw chlorine[ClCl2]+
11:21 egonw chlorine[ClCl2]-
11:21 egonw chlorine[ClF2]-
11:21 egonw chlorine[ClFO]
11:21 egonw chlorine[ClOO]-
11:21 egonw Cl + 2 neighbors
11:22 maclean http://www.genome.jp/dbget-bin/www_bget?cpd:C01486 ?
11:23 egonw could match this one: http://winter.group.shef.ac.uk/molbase/compound.html?id=1800
11:23 maclean right, so 2 lp
11:24 maclean tetrahedral = sp3 ?
11:24 egonw not necessarily
11:24 egonw sp3 gives tetrahedral
11:24 egonw but 4 orbitals cannot host 12 valence electrons
11:24 egonw molbase gives s2p2
11:25 egonw but that's only 4 orbs too
11:26 egonw but I think that 12 is wrong...
11:26 egonw that is, I can only count 10 valence electrons
11:26 egonw 4 in LPs
11:26 egonw 4 in the double bond to one oxygen
11:26 egonw 2 in the sigma bond to the negatively charged oxygen
11:29 maclean hmmm. I can't say I totally follow all this...
11:30 maclean molbase is interesting though.
11:30 maclean curated atom types, in a way.
11:31 egonw indeed
11:31 maclean I would quite like to be able to search by VE configuration though.
11:31 egonw Br is the same as Cl...
11:32 egonw improper hybrid
11:32 egonw that one actually even does seem to have 12 valence electrons
11:35 maclean Heh. One of the chem department members is called "John Dalton" I guess you would have to go into chemistry with a name like that :)
11:39 egonw brb
11:39 maclean ok
12:18 CIA-90 cdk: maclean cdk-1.4.x * r2641484 / (3 files in 3 dirs):
12:18 CIA-90 cdk: Li final
12:18 CIA-90 cdk: Signed-off-by: Egon Willighagen <egonw@users.sourceforge.net> - https://github.com/cdk/cdk/commit/2641484bb985b079e4df7a7be5125d82a4b830a0
12:18 CIA-90 cdk: maclean cdk-1.4.x * r8bae588 / (3 files in 3 dirs):
12:18 CIA-90 cdk: K final
12:19 egonw maclean: OK, enough for now
12:19 egonw I must write some stuff for my book now
12:19 maclean fair enough :)
12:20 egonw and then for our course book
12:20 maclean phew lot to do!
12:20 egonw feel free to rebase patches that remain...
12:20 egonw I have not looked at any new one yet, and don't mind at all those being sequential
12:21 egonw but don't mind either if you do not touch those patches :)
12:21 egonw thanx for the hard work
12:21 egonw these patches are nice and clean
12:21 egonw easy to review
12:21 maclean ok, am probably going to do some smsd patches.
12:21 maclean no problem, thanks for reviewing.
12:21 egonw and my energy now goes more into the actual code, rather than the cosmetics
12:21 maclean indeed.
12:21 maclean bye
12:22 maclean left #cdk
13:04 CIA-90 cdk: Egon Willighagen cdk-1.4.x * rcbb3be4 / build.props : Release 1.4.1 - https://github.com/cdk/cdk/commit/cbb3be4269b0fc39339696d83aec0476f7f1a6ee
13:15 sneumann__ joined #cdk
13:21 sneumann__ left #cdk
13:43 sneumann__ joined #cdk
14:57 sneumann__ left #cdk
15:23 egonw left #cdk
16:14 egonw joined #cdk
16:47 fusulbashi joined #cdk
16:47 fusulbashi left #cdk
16:52 maclean joined #cdk
16:59 maclean egonw : was that clear about morgan numbers?
17:00 egonw dunno... let me read what you wrote :)
17:00 maclean :) fair enough
17:01 egonw ah... that explains why there is a method that takes element into account...
17:01 maclean oh, right getMorganNumbersWithElementSymbols
17:02 egonw well, not that really solves things...
17:02 egonw but I'm wondering how TF this is used to get canonical SMILES then...
17:02 maclean it isn't
17:02 maclean the smiles algorithm is more complex
17:03 egonw mmm... it just seeds with morgan numbers then?
17:03 maclean uhhh. I had a link here somewhere...
17:03 maclean http://www.ra.cs.uni-tuebingen.de/software/joelib/tutorial/algorithms/Morgan.html
17:03 zarah maclean's link is also http://tinyurl.com/3f85lt8
17:04 egonw well, I think I'll read the original paper again then...
17:04 maclean it basically describes two algorithms
17:04 maclean one is the one in MorganNumberTools
17:04 maclean the other is the one in CanonicalLabeler.
17:04 maclean llllllll
17:05 egonw yeah, I think I get the picture
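
As a rough illustration of the Morgan numbers discussed above, here is a sketch of the textbook extended-connectivity iteration. It assumes each atom starts with its neighbour count, that every round replaces a value with the sum of the neighbours' values, and that iteration stops once the number of distinct values no longer grows. It is not the code in MorganNumberTools or CanonicalLabeler, whose details are not shown in this log; the class name and the adjacency-list input format are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MorganSketch {

    private static int countDistinct(long[] values) {
        Set<Long> seen = new HashSet<>();
        for (long v : values) {
            seen.add(v);
        }
        return seen.size();
    }

    // neighbours[i] lists the indices of the atoms bonded to atom i.
    public static long[] morganNumbers(int[][] neighbours) {
        int n = neighbours.length;
        long[] current = new long[n];
        for (int i = 0; i < n; i++) {
            current[i] = neighbours[i].length; // initial value: neighbour count
        }
        int classes = countDistinct(current);

        // At most n rounds, as a safeguard against endless iteration.
        for (int round = 0; round < n; round++) {
            long[] next = new long[n];
            for (int i = 0; i < n; i++) {
                for (int j : neighbours[i]) {
                    next[i] += current[j]; // sum of the neighbours' values
                }
            }
            int nextClasses = countDistinct(next);
            if (nextClasses <= classes) {
                break; // no further discrimination: keep the previous values
            }
            current = next;
            classes = nextClasses;
        }
        return current;
    }

    public static void main(String[] args) {
        // Carbon skeleton of 2-methylbutane: C0-C1(-C2)-C3-C4.
        int[][] neighbours = {{1}, {0, 2, 3}, {1}, {1, 4}, {3}};
        System.out.println(Arrays.toString(morganNumbers(neighbours)));
    }
}
```

The values depend only on connectivity, so element symbols play no part here, and topologically equivalent atoms always end up with equal numbers.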
17:06 maclean cool g2g
17:06 maclean left #cdk
18:50 sneumann__ joined #cdk
19:48 sneumann joined #cdk
19:51 sneumann__ left #cdk
21:00 jbrefort left #cdk
21:17 egonw left #cdk
21:29 sneumann left #cdk
22:37 slyrus_ joined #cdk
22:38 slyrus left #cdk
22:38 slyrus_ is now known as slyrus
