IRC log for #cdk, 2011-07-15

All times shown according to UTC.

Time Nick Message
05:16 sneumann__ joined #cdk
06:00 sneumann__ left #cdk
07:10 sneumann__ joined #cdk
07:32 jbrefort joined #cdk
07:32 egonw_ joined #cdk
07:44 jbrefort left #cdk
07:44 jbrefort joined #cdk
08:20 egonw_ left #cdk
08:30 egonw_ joined #cdk
09:03 maclean joined #cdk
09:03 maclean morning
09:06 sneumann__ moin
09:06 zarah privet sneumann__
10:12 maclean left #cdk
10:15 egonw_ moin moin
10:30 maclean joined #cdk
10:38 egonw_ hi maclean
10:38 maclean hi egonw
10:38 egonw_ I just back only around 12:00...
10:39 maclean You're only back until 12?
10:39 maclean Or...
10:39 egonw_ 12 CEST :)
10:39 egonw_ is now known as egonw
10:40 maclean Ah.
10:40 maclean I think I have a workflow for the first part of the great atomtype patch migration.
10:41 egonw maclean++
10:41 egonw very, very much appreciated!
10:41 egonw I will try to write up a 'CDK Patch Writing HowTo' these holidays...
10:41 maclean Heh.
10:42 maclean Perhaps it could be a book chapter :)
10:43 egonw yeah, probably I'll do that at some point
10:45 maclean It was good to see molecule images in the new draft. Good inchi section too.
10:46 egonw thanx
10:46 egonw yeah, all those images are dynamically generated too :)
10:47 egonw to my regret, I learned this week that some things are not part of renderbasic
10:47 egonw one being implicit hydrogen rendering :/
10:47 egonw that should be in the BasicAtomRenderer...
10:47 maclean Ah.
10:47 egonw Gpox, maclean: or was there a practical reason it could not be in that one?
10:48 maclean Heh, don't ask me - i didn't break them up into modules...
10:48 egonw I have a figure about CH4 versus H-C(-H)(-H)-H
10:48 egonw but just as C in the figure, will only confuse the reader :(
10:48 maclean Well indeed.
10:48 maclean Perhaps something about the atomtypesymbol element?
10:49 egonw well, it's not so much a module thing...
10:49 maclean No?
10:49 egonw i mean, the implicit H count is part of IAtom
10:49 egonw so could be in BasicAtomGenerator
10:49 egonw renderextra is more about requiring things we don't need for small molecules in most situations
10:49 egonw like reactions
10:50 egonw but also isotopes, which require a full isotopefactory
10:50 maclean AtomSymbolElement has implicit H count.
10:50 egonw unsuited for small applets
10:50 egonw indeed, but it's not used!
10:50 egonw the code for doing implicit hydrogens is in ... umm...
10:50 egonw ExtendedAtomRenderer...
10:51 egonw mm... what's the name...
10:52 maclean Can't recall. But anyway, there doesn't seem to be any technical reason not to have it in basic.
10:53 egonw ah, I got it right :)
10:53 egonw http://pele.farmbio.uu.se/nightly-jcp/cdk-javadoc-1.3.11/org/openscience/cdk/renderer/generators/ExtendedAtomGenerator.html
10:53 zarah egonw's link is also http://tinyurl.com/6jevwt6
10:53 egonw indeed... so, I'll talk with Gpox what he thinks, and try to port that code into the BasicAtomGenerator
10:55 maclean makes sense to me.
11:13 maclean Ok, Fe, Co, Pb, and Se are now in separate (intermediate) patches. Now for a tricky one - P...
11:23 maclean P.irane == Phosphirane?
11:23 maclean (doesn't exist...)
11:24 egonw mom...
11:24 egonw InChI=1/C2H5P/c1-3-2/h1H2,2H3
11:25 maclean ahh. takk.
11:25 egonw I tried to add @cdk.inchi to all unit tests in CDKAtomTypeMatcherTest
11:26 maclean Hmmm. 2-Phosphapropene
11:26 maclean Oh, right, yes. The inchi is right there on my screen :)
11:26 maclean There's a " FIXME: compare with previous test... can't both be P.ine..."
11:27 egonw yeah, saw that too... that seems outdated...
11:27 egonw someone (me?) forgot to remove that, I guess
11:27 maclean I'm removing it.
11:28 egonw separate patch?
11:29 egonw (git stash; nano $file; git commit -m "Removed outdated FIXME" $file; git format-patch -1; git stash apply)
11:30 maclean whatever. it's too early to be doing that.
11:30 maclean weird. there's no P.1 or P.32
11:33 maclean Well, effectively it will be a separate patch, as your recent patches have pre-super-ceeded it.
11:34 maclean Sorry, that was terribly unclear. Never mind.
11:35 egonw yeah, I started that on Tuesday before Asad's patches came in...
11:38 maclean Hmmm. P#C
11:38 maclean or H2P#CH
11:40 egonw yeah, that's another reason why I don't like MDL molfiles...
11:40 egonw they have no room for implicit H counts
11:41 maclean Hmmm. ChemSpider says P#CH : http://www.chemspider.com/RecordView.aspx?rid=a15f38eb-ac38-41d7-b417-6ea86cee416d
11:41 zarah maclean's link is also http://tinyurl.com/6drtbns
11:41 egonw P is below N, right?
11:41 egonw with one lone pair, this atom type makes sort of sense
11:42 egonw but like with S, P has the nasty habit of a low lying d orbital
11:42 maclean wikipedia says "It is thus the phosphorus analogue of http://en.wikipedia.org/wiki/Hydrogen_cyanide"
11:42 egonw that easily hybridizes with s and p orbitals, resulting in weird chemistry
11:42 egonw right
11:45 maclean Ahhh. This already exists, and is called P.ide
11:45 egonw did you see that the appendix in the PDF now in fact includes the atom type names?
11:46 maclean You might even say that this is why a separate patch per atom type was a good idea...
11:46 maclean I did, yes, thank you. I may even print that out to help today.
11:46 egonw oh, wait
11:46 egonw let me rebuild it then with Julio's P atom types
12:00 maclean Hmmm. P.ine seems to be perceived in two diff places.
12:01 egonw yeah, possible...
12:01 egonw the current approach is a bit verbose
12:01 egonw but it has to overcome various bits of missing information
12:01 egonw like unknown H count
12:01 egonw or unknown bond order :/
12:01 egonw welcome to cheminformatics
12:02 maclean Well, I would say "welcome to comp-sci", but I'm unable to...
12:02 maclean It's not as if this kind of typing under partial information hasn't been studied elsewhere (I think)
12:03 maclean so P.ine 'means' 0 or 3 neighbours, and no charge.
12:04 maclean Annnnywaay. Onward!
12:04 egonw no, that's not what it should mean...
12:05 egonw P.ine == 3 neighbors... but 3 may be implicit in the graph?
12:06 maclean line 1180
12:06 maclean if (neighbourCount == 0 && formalCharge == 0): type = P.ine
12:06 maclean (in python-java-esque)
12:07 maclean line 1201 : if (charge != 1 and doubleBonds != 1) : type = P.ine
12:08 egonw ok, different lines here, I guess :)
12:09 egonw ah, yeah, that's the clause for PH3
12:09 egonw so, no explicit neighbors, no or zero charge
12:09 egonw line 929 here :)
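The two P.ine clauses quoted above can be sketched roughly as follows. This is a simplified illustration only, not the CDK's actual matcher code: the function name `perceive_phosphorus` and its parameters are hypothetical, and the `neighbour_count == 3` guard on the second clause is an assumption taken from maclean's "0 or 3 neighbours" summary, since the log does not show the enclosing condition.

```python
from typing import Optional


def perceive_phosphorus(neighbour_count: int,
                        formal_charge: int,
                        double_bonds: int) -> Optional[str]:
    """Hypothetical, simplified stand-in for the two P.ine clauses in the log."""
    # Clause for PH3: no explicit neighbours and zero charge,
    # so all three hydrogens are implicit in the graph.
    if neighbour_count == 0 and formal_charge == 0:
        return "P.ine"
    # Second quoted clause (charge != 1 and doubleBonds != 1).
    # Assumption: it sits inside a three-neighbour branch.
    if neighbour_count == 3 and formal_charge != 1 and double_bonds != 1:
        return "P.ine"
    # Anything else is left untyped in this sketch.
    return None
```

Seen this way, the same type name is reached from two places because one clause covers explicit neighbours and the other covers the case where they are only implied.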
12:10 maclean Oh, right.
12:11 maclean Sorry, part-way through adding to that file.
12:12 maclean Note that I am moving some elements from "perceiveCommonSalts" to "perceiveXXX" for element XXX.
12:12 maclean Such as Pt.
12:13 egonw yeah, that makes a lot of sense to me
12:13 egonw maclean++
12:18 maclean left #cdk
13:04 stain left #cdk
13:05 stain joined #cdk
13:27 mgerlich left #cdk
14:46 sneumann__ left #cdk
16:14 maclean joined #cdk
16:26 egonw hi maclean
16:26 maclean hi
16:26 zarah oh hai maclean
16:27 maclean http://www.genome.jp/dbget-bin/www_bget?cpd:C12666
16:27 maclean Na[-] hmmm
16:28 egonw hahahaha
16:28 maclean I think I probably won't add that one...
16:29 egonw no, let's not
16:29 egonw I also noted [Ca2+] covalently bound atom type
16:29 egonw or something like that
16:29 egonw please don't add that either
16:29 maclean Yeah, that's a problem actually.
16:30 maclean KEGG seems to draw bonds to Ca atoms when they are coordinated by dative bonds.
16:30 egonw KEGG is known to be a bit unclean...
16:30 egonw though this Na- surprises me :)
16:30 egonw yes, but MDL molfiles do not have dative bonds ... yada, yada
16:31 egonw that does not mean the CDK should accept it
16:31 egonw instead... that would be very wrong
16:31 egonw if such things are common for a particular database (KEGG is most certainly not alone), a normalization step should be performed before atom typing
16:32 egonw I recently had a SD file where more than half of the entries had a failing N atom type...
16:32 egonw a nice, uncharged, four-coordinate nitrogen :)
16:32 maclean Hmmm.
16:32 egonw prestep: add missing charge
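A normalization pre-step of the kind egonw describes might look something like the sketch below. It is an assumption-laden illustration, not CDK code: the `Atom` class, its fields, and `add_missing_charges` are all hypothetical, and the only rule shown is the one from the log (an uncharged, four-coordinate nitrogen gets a +1 formal charge).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Atom:
    """Hypothetical minimal atom record; not the CDK IAtom interface."""
    symbol: str
    formal_charge: int = 0
    neighbour_count: int = 0  # explicit neighbours plus implicit hydrogens


def add_missing_charges(atoms: List[Atom]) -> int:
    """Set +1 on uncharged four-coordinate nitrogens; return how many were fixed."""
    fixed = 0
    for atom in atoms:
        if (atom.symbol == "N"
                and atom.formal_charge == 0
                and atom.neighbour_count == 4):
            atom.formal_charge = 1
            fixed += 1
    return fixed


# Usage: run the pre-step first, then hand the atoms to atom typing.
atoms = [Atom("N", 0, 4), Atom("N", 0, 3), Atom("C", 0, 4)]
fixed_count = add_missing_charges(atoms)
```

Keeping this as a separate pass means the atom type list itself does not have to accept database-specific drawing conventions.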
16:35 * maclean is up to patch 23 : Hg.
16:36 egonw please do submit a first few?
16:37 maclean Well these are only pre-patches, really.
16:37 maclean I still have to do the code-generation step to convert the molfiles into cdk.data calls.
16:37 egonw ah, OK
16:38 maclean I'll be halfway there with Ca (the next one).
16:48 maclean going for a break..
16:48 maclean left #cdk
20:15 egonw left #cdk
20:54 jbrefort left #cdk
21:28 Gpox left #cdk
