IRC log for #metacpan, 2011-06-12

All times shown according to UTC.

Time Nick Message
01:01 hoelzro joined #metacpan
01:18 klapperl joined #metacpan
01:23 klapperl_ left #metacpan
01:27 woldrich joined #metacpan
02:36 garu joined #metacpan
02:38 oalders mo: re: http://i.imgur.com/kmPZm.png looks great! (and good luck with your move)
02:39 oalders punytan, mo: /pod/{module} did return JSON under the old API.
02:40 oalders pod was a type in ElasticSearch, so any searches on this URL were direct searches on ES
02:40 oalders with the new version mo has set up some convenience URLs.  /pod is now a wrapper around ES, so it doesn't wrap the response in JSON
02:41 oalders so, it's a different response, but it's actually organized better now
02:41 oalders sorry about the breakage. we'll have a better plan for versioning going forward :)
02:48 woldrich perhaps it's just me, but I find the greyed out description very hard to read.
02:49 confound is there no rss feed of http://beta.metacpan.org/recent ?
02:49 dipsy [ Search the CPAN - beta.metacpan.org ]
02:52 confound actually I guess there aren't rss feeds in general
02:53 woldrich confound: https://github.com/CPAN-API/metacpan-web/issues/76#issuecomment-1353525
02:53 dipsy [ #76: RSS Feed of Recent Uploads - Issues - CPAN-API/metacpan-web - GitHub ]
02:53 confound thanks
02:54 autarch I think the font color is less of an issue than the fact that it's just a jumble of text
02:55 autarch I think it'd be a lot more useful to extract the first few sentences from the DESCRIPTION section, and maybe fall back to the first section with text?
02:55 autarch but I do think it could be a bit darker
02:55 woldrich yes, it's also mentioned in an issue
02:55 autarch ah
02:55 autarch although what we _really_ need is better result ranking
02:55 autarch the first is good, but the rest seem essentially arbitrary
02:56 autarch I think that'll have to wait for more data to work with though
03:00 alnewkirk left #metacpan
03:24 oalders autarch, woldrich:  http://i.imgur.com/kmPZm.png
03:25 oalders but yes, result ranking is a big part of the GSoC plan
03:25 autarch what am I looking at?
03:25 oalders what mo is working on for search result layouts
03:25 autarch oalders: yeah, that's what i was commenting on earlier
03:25 autarch I was thinking about ranking, and I think there's two aspects of it
03:25 autarch one is determining whether something is relevant to the search, the other is sorting relevant results
03:26 autarch I think focusing on the relevance first is more important than sorting for the moment
03:26 autarch the Moose example is a good one, since some of the results are totally irrelevant, like Mason::Moose and MARC::Moose
03:27 autarch I think there's some weighting that could be done, though I'm not sure if this needs to happen at the API end
03:27 autarch like if the word being searched for is in the distro name, that should rank very highly
03:27 autarch if the word is in _most_ of the modules in a distro, it should also rank highly (so any DateTime:: distro ranks highly for DateTime)
03:28 autarch by contrast, if the word is in just one module out of many (like Mason::Moose) it should rank lowly
03:28 autarch all of this should probably take precedence over the use of the word in the POD itself
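
The weighting autarch describes above can be sketched in Python as a rough illustration. The `Distro` class, the `relevance` function and the three weights are all hypothetical; the log only gives the ordering (distro name first, then the share of modules containing the word, then POD), not any numbers or an implementation.

```python
from dataclasses import dataclass


@dataclass
class Distro:
    name: str            # distribution name, e.g. "DateTime-Format-ISO8601"
    modules: list[str]   # module names shipped in the distribution
    pod: str = ""        # POD text, concatenated


# Hypothetical weights: only their order comes from the discussion.
W_NAME, W_COVERAGE, W_POD = 10.0, 5.0, 1.0


def relevance(term: str, distro: Distro) -> float:
    """Score a distro for a one-word search term."""
    t = term.lower()
    # The word appears as a component of the distro name.
    name_hit = t in distro.name.lower().split("-")
    # Fraction of modules whose name contains the word as a component.
    hits = sum(1 for m in distro.modules if t in m.lower().split("::"))
    coverage = hits / len(distro.modules) if distro.modules else 0.0
    # Use of the word in the POD counts least.
    pod_hit = t in distro.pod.lower()
    return W_NAME * name_hit + W_COVERAGE * coverage + W_POD * pod_hit
```

Under these assumed weights, a distro where one module out of several mentions the word scores far below one whose name and every module contain it.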
03:29 oalders right. i'm wondering if these ideas should have a home in the wiki
03:29 autarch I can put them there if that's useful
03:29 autarch we may also need an explicit disassociation mechanism
03:29 autarch I'm realizing that MARC::Moose is the name of the distro, but it's a bad name and should not rank highly for Moose searches, regardless of what I've just said ;)
03:29 oalders :)
03:30 oalders there will be some up and downvoting of modules which will help weight results
03:31 autarch voting on the result quality, or the distro quality?
03:31 autarch cause the issue here is that the result isn't relevant, not that the module is good or bad
03:31 oalders distro quality.  i see what you mean. it would also be helpful to flag whether a dist even belongs in a set of search results
03:31 autarch right
03:32 autarch distro quality is for the sorting aspect
03:32 autarch but sorting is only useful if the results are relevant, otherwise you're sorting an essentially arbitrary list
03:34 autarch where's the wiki?
03:35 oalders https://github.com/CPAN-API/cpan-api/wiki
03:35 dipsy [ What is MetaCPAN? - GitHub ]
03:35 autarch aha, I was in the wrong repo
03:35 oalders oops
03:35 oalders you were in the right repo
03:36 oalders https://github.com/CPAN-API/metacpan-web/wiki
03:36 dipsy [ CPAN-API/metacpan-web - GitHub ]
03:36 oalders i *think* the scoring is more on the front end
03:36 autarch I can't see the wiki or something
03:36 autarch well, it's hard to see how the frontend could do this efficiently
03:36 oalders wow. i guess it hasn't been created yet.
03:37 oalders how so?
03:37 autarch is it going to pull _every_ result, then filter, then show the first 10?
03:37 autarch then do that again for every page of results?
03:37 oalders no
03:37 oalders i believe in the es request you can assign more or less weight to terms etc
03:37 oalders adjust your scoring that way
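
As a hedged sketch of what oalders means by assigning weight in the ES request: the search body below boosts some fields over others and asks the server for one page at a time, so the front end never has to pull every result. `bool`, `should`, `match`, `boost`, `from` and `size` are Elasticsearch query DSL names as documented for later releases; whether the 2011 server accepted this exact shape is not confirmed by the log. The field names (`distribution`, `module`, `pod`), the boost values and `build_query` itself are assumptions.

```python
import json


def build_query(term: str, page: int = 0, per_page: int = 10) -> dict:
    """Build a search body that weights some fields more than others."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Hypothetical fields and boosts, highest weight first.
                    {"match": {"distribution": {"query": term, "boost": 10}}},
                    {"match": {"module": {"query": term, "boost": 5}}},
                    {"match": {"pod": {"query": term, "boost": 1}}},
                ]
            }
        },
        # Scoring and sorting happen on the server; only one page comes back.
        "from": page * per_page,
        "size": per_page,
    }


if __name__ == "__main__":
    print(json.dumps(build_query("Moose", page=1), indent=2))
```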
03:37 autarch I see
03:38 autarch but the thing I'm talking about where you give more weight if all the modules have a word, or explicitly derank something
03:38 autarch that probably needs backend support
03:39 oalders it may well.  you know, this can probably go into the wiki i initially pointed you at. i'm not sure having more than one wiki is even helpful
03:40 autarch ok
03:40 autarch should I make a new page?
03:40 woldrich oalders: yes that's what I, as well, commented on. It looks nice, except the font color that's really hard to read for me. But that might just be my old grumpy eyes/my old monitors/imagination
03:40 apeiron another module that's not indexed, though this time I'm not at all bothered: common::sense
03:41 oalders woldrich: oh, ok. i have to be better about reading the logs :)  i think that font colour is hard to read as well
03:41 autarch I do like the "related modules" bit that mo did
03:41 oalders apeiron: common::sense is already on the radar :)
03:42 oalders yeah, the related modules is helpful. rather than stacking the results with lots of modules in the same dist
03:42 apeiron As I said, I'm not at all bothered that Marc "I want to fork perl5 for Unicode support" Lehmann's code isn't indexed.
03:43 autarch oalders: yeah, having the same dist show up more than once was unhelpful, this is _way_ better
03:43 oalders apeiron: there's no .pm in that dist
03:43 apeiron oalders, indeed
03:44 oalders i guess it gets built by sense.pm.PL
03:44 apeiron nod
03:44 oalders so, that's an edge case
03:44 apeiron SCO picks it up somehow.
03:45 oalders that's one thing about SCO. there has been a lot of time to work on the weird little things.  we're just trying to figure all of that out
03:45 oalders unfortunately without the knowledge that's in SCO
03:45 apeiron indeed.
03:45 oalders but at least now everyone has access to it :)
03:49 autarch yeah, I think this has the potential to catch up to and surpass SCO _very_ quickly
03:50 * apeiron has stopped using SCO
03:51 oalders which reminds me, we should point the dist downloads to our own CPAN. would give us an idea of download numbers
03:51 woldrich why is SCO closed source in the first place? if there's a short version of the history somewhere.
03:53 * oalders has no idea
03:55 autarch https://github.com/CPAN-API/cpan-api/wiki/Ideas
03:55 dipsy [ Ideas - GitHub ]
04:56 hoelzro|laptop joined #metacpan
05:05 hoelzro|laptop what properties do you folks think an annotation should have?
05:06 hoelzro|laptop currently, I'm thinking module, section, text
05:07 oalders maybe the type of annotation
05:07 oalders like a POD correction, POD addition, comment
05:07 oalders for instance, if it's just a comment, you wouldn't want to create a patch out of it
05:09 hoelzro|laptop mhmm
05:09 hoelzro|laptop well, I was also thinking of starting simple; leaving the "create patch out of this annotation" functionality for later
05:09 oalders for sure
05:10 oalders but you *may* want to consider what kind of annotation it is
05:10 hoelzro|laptop I'd like to get the initial revision out there as fast as I can, and then make iterative improvements upon it
05:10 oalders but keeping it simple is good too
05:10 hoelzro|laptop I suppose it couldn't hurt to put it in the document class
05:10 oalders see the use cases and see if it makes sense to add further properties
05:11 hoelzro|laptop well, I took some notes from the initial conversation we had about it
05:11 hoelzro|laptop I think that pull requests/patches are the biggest piece of data so far
05:12 hoelzro|laptop so...when making an ElasticSearchX::Model::Document, am I restricted to the kinds of types my attributes can have?
05:12 hoelzro|laptop (ie. can I use enum)
05:14 oalders i haven't looked closely at the types in ElasticSearchX::Model
05:14 oalders that's mo's work.  i believe he will release it to cpan eventually
05:15 hoelzro|laptop he should; it's pretty cool!
05:15 oalders it is :)
05:15 oalders i've only looked at it briefly
05:16 oalders i wrote the original parsing code. then mo changed a lot of it and i'm not quite up to speed
05:16 hoelzro|laptop ah
06:05 mo hoelzro|laptop: restricted to what types?
06:06 hoelzro|laptop mo: I'm just wondering what kind of types I can use for ElasticSearchX::Model::Document attributes
06:09 hoelzro|laptop for example, can I use enums?
06:10 mo those are just array aren't they?
06:10 hoelzro|laptop I think so
06:11 mo then just use ArrayRef
06:11 mo that'll work
06:11 hoelzro|laptop hmm
06:11 hoelzro|laptop well, actually, I don't think they are
06:11 hoelzro|laptop I mean, I don't know what they are internally
06:11 hoelzro|laptop probably an anonymous subtype of Str or something
06:11 hoelzro|laptop or maybe even Any
06:16 mo what does Enum do?
06:17 hoelzro|laptop mo: enum allows you to construct a type that only accepts a set of given values
06:17 hoelzro|laptop ex. has foo => (isa => enum([qw/bar baz/])) # foo can only be 'bar' or 'baz'
06:17 mo ah I see
06:17 mo yes those work
06:18 hoelzro|laptop ok, good =)
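
For readers who do not know Moose, the same idea can be sketched in Python: an annotation with the properties discussed earlier (module, section, text) plus a type restricted to a fixed set of values. This is not ElasticSearchX::Model code; `Annotation`, `AnnotationType`, `patchable` and the three value names are hypothetical, chosen to mirror oalders's "POD correction, POD addition, comment".

```python
from dataclasses import dataclass
from enum import Enum


class AnnotationType(Enum):
    POD_CORRECTION = "pod_correction"
    POD_ADDITION = "pod_addition"
    COMMENT = "comment"


@dataclass
class Annotation:
    module: str
    section: str
    text: str
    type: AnnotationType = AnnotationType.COMMENT

    def __post_init__(self):
        # Accept a plain string or an enum member; anything outside
        # the allowed set raises ValueError.
        self.type = AnnotationType(self.type)

    def patchable(self) -> bool:
        # A plain comment should not be turned into a patch.
        return self.type is not AnnotationType.COMMENT
```

Keeping the type as a separate field leaves room to add the patch-generation step later without changing the stored shape.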
06:54 hoelzro|laptop left #metacpan
08:53 clintongormley left #metacpan
08:55 clintongormley joined #metacpan
09:01 clintongormley autarch: oalders: re the search ranking and the search clustering - it would be very helpful indeed to have some "rules" defined, which will allow us to model how we store the data and how we model the queries
09:02 clintongormley it definitely needs to be stored in the backend, with the frontend just contributing information (eg ranking)
09:02 clintongormley also: how do we present the search results?  What do we show if the user searches for Moose?  What if he searches for Moose::Util?
09:04 clintongormley hoelzro: re data types in ES, take a look at: http://www.elasticsearch.org/guide/reference/mapping
09:04 dipsy [ elasticsearch - guide - Mapping ]
09:04 clintongormley and especially http://www.elasticsearch.org/guide/reference/mapping/core-types.html
09:04 dipsy [ elasticsearch - guide - Core Types ]
09:05 clintongormley so you could definitely use enums, but they would just be stored as an array in the backend
09:05 clintongormley s/backend/ES server/
09:08 clintongormley re memory usage - ES does use memory, and performs best when it has sufficient RAM
09:08 clintongormley it is, however, doing an enormous job, and ES is a highly tuned bit of kit
09:12 jsut_ joined #metacpan
09:17 jsut left #metacpan
11:40 confound clintongormley: it's not like his memory use complaint had anything to do with ES specifically, anyway. just java
11:40 clintongormley confound: sure
12:53 oalders mo, clintongormley: autarch mapped out some ideas for search rules here: https://github.com/CPAN-API/cpan-api/wiki/Ideas
12:53 dipsy [ Ideas - GitHub ]
13:57 punytan oalders: mo: yeah, introducing a version into the endpoint for the web api is a nice way to keep backward compat
14:35 autarch I think the question of module vs distro search is a good one
14:36 autarch I think if someone searches for something that matches a single module, as long as that single module shows up first in the results, that's likely to be good enough
14:36 autarch currently the code basically strips out :: in a search term, I'd say that it should actually leave that alone so that a search for Moose::Util just finds that module
14:37 autarch or maybe it needs to be smarter, it can search both with and without ::, if it finds a match using ::, go with that, otherwise search just on the words without ::
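
autarch's two-step lookup can be sketched as follows, assuming a plain in-memory list of module names stands in for the index. The `search` function and its case-insensitive matching are assumptions for illustration, not the site's code.

```python
def search(term: str, modules: list[str]) -> list[str]:
    """Try the term with '::' intact first, then fall back to its words."""
    # Step 1: keep '::' and look for a module with exactly that name.
    exact = [m for m in modules if m.lower() == term.lower()]
    if exact:
        return exact
    # Step 2: drop '::' and match modules containing every word.
    words = [w.lower() for w in term.split("::") if w]
    return [m for m in modules if all(w in m.lower() for w in words)]
```

With this ordering, a search for `Moose::Util` returns only that module when it exists, while a term with no exact match still finds modules containing all of its words.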
15:43 hoelzro left #metacpan
15:44 hoelzro joined #metacpan
15:44 hoelzro I always thought it'd be nice if a search like Foo::Bar:: limited the results to modules under the Foo::Bar namespace
15:45 hoelzro (not sure if Foo::Bar itself would be included)
15:46 hoelzro just my 2¢
16:25 hoelzro instead of that, a prefix: or namespace: option would be nice
19:34 Hinrik module:^Foo::Bar::
19:34 Hinrik or something
21:03 hoelzro yeah, that'd be fine
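
A minimal sketch of the namespace filter hoelzro and Hinrik are describing, again over a plain list of module names. `in_namespace` and its `include_root` flag are hypothetical; the flag exists only because the log leaves open whether Foo::Bar itself should be included.

```python
def in_namespace(prefix: str, modules: list[str],
                 include_root: bool = False) -> list[str]:
    """Return the modules that live under the given namespace prefix."""
    ns = prefix.rstrip(":")  # "Foo::Bar::" and "Foo::Bar" mean the same
    return [
        m for m in modules
        # Requiring the trailing '::' keeps Foo::Barn out of Foo::Bar.
        if m.startswith(ns + "::") or (include_root and m == ns)
    ]
```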
21:04 hoelzro I had another thought as well: what about P::M as an alias for Plack::Middleware?
21:04 hoelzro or C::P for Catalyst::Plugin, D::Z::P for Dist::Zilla::Plugin, etc
21:04 hoelzro there might be too many ambiguities, though...
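
The ambiguity hoelzro is worried about shows up quickly in a sketch. Here an alias is expanded by matching each initial against the start of the corresponding namespace component; `expand_alias` and this matching rule are assumptions, one possible reading of the idea rather than anything agreed in the log.

```python
def expand_alias(alias: str, namespaces: list[str]) -> list[str]:
    """Return every known namespace whose components start with the
    alias's components, e.g. 'D::Z::P' for 'Dist::Zilla::Plugin'."""
    initials = alias.split("::")
    matches = []
    for ns in namespaces:
        parts = ns.split("::")
        if len(parts) == len(initials) and all(
            part.startswith(initial)
            for part, initial in zip(parts, initials)
        ):
            matches.append(ns)
    return matches
```

Any alias that returns more than one namespace would need a tie-break rule, such as a curated alias table, before it could be used in search.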
21:14 confound http://beta.metacpan.org/release/URI-Dispatch doesn't have any modules indexed
21:14 dipsy [ URI-Dispatch-v1.1 - beta.metacpan.org ]
22:11 clintongormley left #metacpan