| Time |
S |
Nick |
Message |
| 01:01 |
|
|
hoelzro joined #metacpan |
| 01:18 |
|
|
klapperl joined #metacpan |
| 01:23 |
|
|
klapperl_ left #metacpan |
| 01:27 |
|
|
woldrich joined #metacpan |
| 02:36 |
|
|
garu joined #metacpan |
| 02:38 |
|
oalders |
mo: re: http://i.imgur.com/kmPZm.png looks great! (and good luck with your move) |
| 02:39 |
|
oalders |
punytan, mo: /pod/{module} did return JSON under the old API. |
| 02:40 |
|
oalders |
pod was a type in ElasticSearch, so any searches on this URL were direct searches on ES |
| 02:40 |
|
oalders |
with the new version mo has set up some convenience URLs. /pod is now a wrapper around ES, so it doesn't wrap the response in JSON |
| 02:41 |
|
oalders |
so, it's a different response, but it's actually organized better now |
| 02:41 |
|
oalders |
sorry about the breakage. we'll have a better plan for versioning going forward :) |
| 02:48 |
|
woldrich |
perhaps it's just me, but I find the greyed out description very hard to read. |
| 02:49 |
|
confound |
is there no rss feed of http://beta.metacpan.org/recent ? |
| 02:49 |
|
dipsy |
[ Search the CPAN - beta.metacpan.org ] |
| 02:52 |
|
confound |
actually I guess there aren't rss feeds in general |
| 02:53 |
|
woldrich |
confound: https://github.com/CPAN-API/me[…]uecomment-1353525 |
| 02:53 |
|
dipsy |
[ #76: RSS Feed of Recent Uploads - Issues - CPAN-API/metacpan-web - GitHub ] |
| 02:53 |
|
confound |
thanks |
| 02:54 |
|
autarch |
I think the font color is less of an issue than the fact that it's just a jumble of text |
| 02:55 |
|
autarch |
I think it'd be a lot more useful to extract the first few sentences from the DESCRIPTION section, and maybe fall back to the first section with text? |
| 02:55 |
|
autarch |
but I do think it could be a bit darker |
| 02:55 |
|
woldrich |
yes, it's also mentioned in an issue |
| 02:55 |
|
autarch |
ah |
| 02:55 |
|
autarch |
although what we _really_ need is better result ranking |
| 02:55 |
|
autarch |
the first is good, but the rest seem essentially arbitrary |
| 02:56 |
|
autarch |
I think that'll have to wait for more data to work with though |
| 03:00 |
|
|
alnewkirk left #metacpan |
| 03:24 |
|
oalders |
autarch, woldrich: http://i.imgur.com/kmPZm.png |
| 03:25 |
|
oalders |
but yes, result ranking is a big part of the GSoC plan |
| 03:25 |
|
autarch |
what am I looking at? |
| 03:25 |
|
oalders |
what mo is working on for search result layouts |
| 03:25 |
|
autarch |
oalders: yeah, that's what i was commenting on earlier |
| 03:25 |
|
autarch |
I was thinking about ranking, and I think there's two aspects of it |
| 03:25 |
|
autarch |
one is determining whether something is relevant to the search, the other is sorting relevant results |
| 03:26 |
|
autarch |
I think focusing on the relevance first is more important than sorting for the moment |
| 03:26 |
|
autarch |
the Moose example is a good one, since some of the results are totally irrelevant, like Mason::Moose and MARC::Moose |
| 03:27 |
|
autarch |
I think there's some weighting that could be done, though I'm not sure if this needs to happen at the API end |
| 03:27 |
|
autarch |
like if the word being searched for is in the distro name, that should rank very highly |
| 03:27 |
|
autarch |
if the word is in _most_ of the modules in a distro, it should also rank highly (so any DateTime:: distro ranks highly for DateTime) |
| 03:28 |
|
autarch |
by contrast, if the word is in just one module out of many (like Mason::Moose) it should rank lowly |
| 03:28 |
|
autarch |
all of this should probably take precedence over the use of the word in the POD itself |
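The weighting rules autarch lists above can be sketched as a toy scoring function. This is only an illustration under stated assumptions: the `Distro` record, the `relevance` function, and the weights 10 / 5 / 0.1 are all hypothetical and are not part of the MetaCPAN API or its ElasticSearch mapping.

```python
from dataclasses import dataclass, field


@dataclass
class Distro:
    """Hypothetical record for one distribution (not the real MetaCPAN schema)."""
    name: str                                     # e.g. "DateTime-TimeZone"
    modules: list = field(default_factory=list)   # module names in the distro
    pod_mentions: int = 0                         # times the term appears in the POD


def name_words(name: str) -> set:
    """Split 'Mason::Moose' or 'DateTime-TimeZone' into lowercase words."""
    return {w.lower() for w in name.replace("::", "-").split("-") if w}


def relevance(term: str, distro: Distro) -> float:
    """Score a distro for a one-word search term (weights are assumptions)."""
    term = term.lower()
    score = 0.0
    # Rule 1: the term is a word in the distro name itself.
    if term in name_words(distro.name):
        score += 10.0
    # Rule 2: the share of module names that contain the term as a word.
    if distro.modules:
        hits = sum(term in name_words(m) for m in distro.modules)
        score += 5.0 * hits / len(distro.modules)
    # Rule 3: POD usage counts too, but is capped at 1.0 so it can never
    # outweigh a distro-name match.
    score += min(distro.pod_mentions, 10) * 0.1
    return score
```

With these weights a DateTime:: distro scores far higher for "DateTime" than Mason does for "Moose", even if Mason's POD mentions Moose many times.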
| 03:29 |
|
oalders |
right. i'm wondering if these ideas should have a home in the wiki |
| 03:29 |
|
autarch |
I can put them there if that's useful |
| 03:29 |
|
autarch |
we may also need an explicit disassociation mechanism |
| 03:29 |
|
autarch |
I'm realizing that MARC::Moose is the name of the distro, but it's a bad name and should not rank highly for Moose searches, regardless of what I've just said ;) |
| 03:29 |
|
oalders |
:) |
| 03:30 |
|
oalders |
there will be some up and downvoting of modules which will help weight results |
| 03:31 |
|
autarch |
voting on the result quality, or the distro quality? |
| 03:31 |
|
autarch |
cause the issue here is that the result isn't relevant, not that the module is good or bad |
| 03:31 |
|
oalders |
distro quality. i see what you mean. it would also be helpful to flag whether a dist even belongs in a set of search results |
| 03:31 |
|
autarch |
right |
| 03:32 |
|
autarch |
distro quality is for the sorting aspect |
| 03:32 |
|
autarch |
but sorting is only useful if the results are relevant, otherwise you're sorting an essentially arbitrary list |
| 03:34 |
|
autarch |
where's the wiki? |
| 03:35 |
|
oalders |
https://github.com/CPAN-API/cpan-api/wiki |
| 03:35 |
|
dipsy |
[ What is MetaCPAN? - GitHub ] |
| 03:35 |
|
autarch |
aha, I was in the wrong repo |
| 03:35 |
|
oalders |
oops |
| 03:35 |
|
oalders |
you were in the right repo |
| 03:36 |
|
oalders |
https://github.com/CPAN-API/metacpan-web/wiki |
| 03:36 |
|
dipsy |
[ CPAN-API/metacpan-web - GitHub ] |
| 03:36 |
|
oalders |
i *think* the scoring is more on the front end |
| 03:36 |
|
autarch |
I can't see the wiki or something |
| 03:36 |
|
autarch |
well, it's hard to see how the frontend could do this efficiently |
| 03:36 |
|
oalders |
wow. i guess it hasn't been created yet. |
| 03:37 |
|
oalders |
how so? |
| 03:37 |
|
autarch |
is it going to pull _every_ result, then filter, then show the first 10? |
| 03:37 |
|
autarch |
then do that again for every page of results? |
| 03:37 |
|
oalders |
no |
| 03:37 |
|
oalders |
i believe in the es request you can assign more or less weight to terms etc |
| 03:37 |
|
oalders |
adjust your scoring that way |
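A rough sketch of what oalders describes: the scoring weights travel with the search request, and the server returns only one page of results. The field names (`distribution`, `module_name`, `pod`) and the boost values are assumptions for illustration. The `match` clause with a `boost` follows the query DSL of recent Elasticsearch releases; the ES version in use at the time of this log may have used a different clause name.

```python
import json


def boosted_query(term: str, page: int = 0, page_size: int = 10) -> dict:
    """Build a bool query whose clauses carry different boosts.

    Field names and boost values are hypothetical, not MetaCPAN's mapping.
    """
    def clause(field_name: str, boost: float) -> dict:
        return {"match": {field_name: {"query": term, "boost": boost}}}

    return {
        "query": {
            "bool": {
                "should": [
                    clause("distribution", 10.0),  # distro name matters most
                    clause("module_name", 5.0),    # then module names
                    clause("pod", 1.0),            # then the POD text
                ]
            }
        },
        # Paging happens on the server, so the frontend never has to
        # pull every result and filter it itself.
        "from": page * page_size,
        "size": page_size,
    }


# Serialized request body for the first page of a "Moose" search.
body = json.dumps(boosted_query("Moose"), indent=2)
```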
| 03:37 |
|
autarch |
I see |
| 03:38 |
|
autarch |
but the thing I'm talking about where you give more weight if all the modules have a word, or explicitly derank something |
| 03:38 |
|
autarch |
that probably needs backend support |
| 03:39 |
|
oalders |
it may well. you know, this can probably go into the wiki i initially pointed you at. i'm not sure having more than one wiki is even helpful |
| 03:40 |
|
autarch |
ok |
| 03:40 |
|
autarch |
should I make a new page? |
| 03:40 |
|
woldrich |
oalders: yes that's what I, as well, commented on. It looks nice, except the font color that's really hard to read for me. But that might just be my old grumpy eyes/my old monitors/imagination |
| 03:40 |
|
apeiron |
another module that's not indexed, though this time I'm not at all bothered: common::sense |
| 03:41 |
|
oalders |
woldrich: oh, ok. i have to be better about reading the logs :) i think that font colour is hard to read as well |
| 03:41 |
|
autarch |
I do like the "related modules" bit that mo did |
| 03:41 |
|
oalders |
apeiron: common::sense is already on the radar :) |
| 03:42 |
|
oalders |
yeah, the related modules is helpful. rather than stacking the results with lots of modules in the same dist |
| 03:42 |
|
apeiron |
As I said, I'm not at all bothered that Marc "I want to fork perl5 for Unicode support" Lehmann's code isn't indexed. |
| 03:43 |
|
autarch |
oalders: yeah, having the same dist show up more than once was unhelpful, this is _way_ better |
| 03:43 |
|
oalders |
apeiron: there's no .pm in that dist |
| 03:43 |
|
apeiron |
oalders, indeed |
| 03:44 |
|
oalders |
i guess it gets built by sense.pm.PL |
| 03:44 |
|
apeiron |
nod |
| 03:44 |
|
oalders |
so, that's an edge case |
| 03:44 |
|
apeiron |
SCO picks it up somehow. |
| 03:45 |
|
oalders |
that's one thing about SCO. there has been a lot of time to work on the weird little things. we're just trying to figure all of that out |
| 03:45 |
|
oalders |
unfortunately without the knowledge that's in SCO |
| 03:45 |
|
apeiron |
indeed. |
| 03:45 |
|
oalders |
but at least now everyone has access to it :) |
| 03:49 |
|
autarch |
yeah, I think this has the potential to catch up to and surpass SCO _very_ quickly |
| 03:50 |
|
* apeiron |
has stopped using SCO |
| 03:51 |
|
oalders |
which reminds me, we should point the dist downloads to our own CPAN. would give us an idea of download numbers |
| 03:51 |
|
woldrich |
why is SCO closed source in the first place? if there's a short version of the history somewhere. |
| 03:53 |
|
* oalders |
has no idea |
| 03:55 |
|
autarch |
https://github.com/CPAN-API/cpan-api/wiki/Ideas |
| 03:55 |
|
dipsy |
[ Ideas - GitHub ] |
| 04:56 |
|
|
hoelzro|laptop joined #metacpan |
| 05:05 |
|
hoelzro|laptop |
what properties do you folks think an annotation should have? |
| 05:06 |
|
hoelzro|laptop |
currently, I'm thinking module, section, text |
| 05:07 |
|
oalders |
maybe the type of annotation |
| 05:07 |
|
oalders |
like a POD correction, POD addition, comment |
| 05:07 |
|
oalders |
for instance, if it's just a comment, you wouldn't want to create a patch out of it |
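The properties discussed so far (module, section, text, plus a type) could be modelled roughly as below. This is a sketch only: the class and member names are hypothetical and it does not use ElasticSearchX::Model.

```python
from dataclasses import dataclass
from enum import Enum


class AnnotationType(Enum):
    """The three kinds of annotation mentioned in the discussion."""
    POD_CORRECTION = "pod_correction"
    POD_ADDITION = "pod_addition"
    COMMENT = "comment"


@dataclass
class Annotation:
    module: str    # module the annotation is attached to
    section: str   # POD section it refers to
    text: str      # the annotation body
    type: AnnotationType = AnnotationType.COMMENT

    def patchable(self) -> bool:
        """A plain comment should not be turned into a patch."""
        return self.type is not AnnotationType.COMMENT
```

Defaulting `type` to a comment keeps the simple case simple while leaving room for the patch functionality later.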
| 05:09 |
|
hoelzro|laptop |
mhmm |
| 05:09 |
|
hoelzro|laptop |
well, I was also thinking of starting simple; leaving the "create patch out of this annotation" functionality for later |
| 05:09 |
|
oalders |
for sure |
| 05:10 |
|
oalders |
but you *may* want to consider what kind of annotation it is |
| 05:10 |
|
hoelzro|laptop |
I'd like to get the initial revision out there as fast as I can, and then make iterative improvements upon it |
| 05:10 |
|
oalders |
but keeping it simple is good too |
| 05:10 |
|
hoelzro|laptop |
I suppose it couldn't hurt to put it in the document class |
| 05:10 |
|
oalders |
see the use cases and see if it makes sense to add further properties |
| 05:11 |
|
hoelzro|laptop |
well, I took some notes from the initial conversation we had about it |
| 05:11 |
|
hoelzro|laptop |
I think that pull requests/patches are the biggest piece of data so far |
| 05:12 |
|
hoelzro|laptop |
so...when making an ElasticSearchX::Model::Document, am I restricted to the kinds of types my attributes can have? |
| 05:12 |
|
hoelzro|laptop |
(ie. can I use enum) |
| 05:14 |
|
oalders |
i haven't looked closely at the types in ElasticSearchX::Model |
| 05:14 |
|
oalders |
that's mo's work. i believe he will release it to cpan eventually |
| 05:15 |
|
hoelzro|laptop |
he should; it's pretty cool! |
| 05:15 |
|
oalders |
it is :) |
| 05:15 |
|
oalders |
i've only looked at it briefly |
| 05:16 |
|
oalders |
i wrote the original parsing code. then mo changed a lot of it and i'm not quite up to speed |
| 05:16 |
|
hoelzro|laptop |
ah |
| 06:05 |
|
mo |
hoelzro|laptop: restricted to what types? |
| 06:06 |
|
hoelzro|laptop |
mo: I'm just wondering what kind of types I can use for ElasticSearchX::Model::Document attributes |
| 06:09 |
|
hoelzro|laptop |
for example, can I use enums? |
| 06:10 |
|
mo |
those are just arrays, aren't they? |
| 06:10 |
|
hoelzro|laptop |
I think so |
| 06:11 |
|
mo |
then just use ArrayRef |
| 06:11 |
|
mo |
that'll work |
| 06:11 |
|
hoelzro|laptop |
hmm |
| 06:11 |
|
hoelzro|laptop |
well, actually, I don't think they are |
| 06:11 |
|
hoelzro|laptop |
I mean, I don't know what they are internally |
| 06:11 |
|
hoelzro|laptop |
probably an anonymous subtype of Str or something |
| 06:11 |
|
hoelzro|laptop |
or maybe even Any |
| 06:16 |
|
mo |
what does Enum do? |
| 06:17 |
|
hoelzro|laptop |
mo: enum allows you to construct a type that only accepts a set of given values |
| 06:17 |
|
hoelzro|laptop |
ex. has foo => (isa => enum([qw/bar baz/])) # foo can only be 'bar' or 'baz' |
| 06:17 |
|
mo |
ah I see |
| 06:17 |
|
mo |
yes those work |
| 06:18 |
|
hoelzro|laptop |
ok, good =) |
| 06:54 |
|
|
hoelzro|laptop left #metacpan |
| 08:53 |
|
|
clintongormley left #metacpan |
| 08:55 |
|
|
clintongormley joined #metacpan |
| 09:01 |
|
clintongormley |
autarch: oalders: re the search ranking and the search clustering - it would be very helpful indeed to have some "rules" defined, which will allow us to model how we store the data and how we model the queries |
| 09:02 |
|
clintongormley |
it definitely needs to be stored in the backend, with the frontend just contributing information (eg ranking) |
| 09:02 |
|
clintongormley |
also: how do we present the search results? What do we show if the user searches for Moose? What if he searches for Moose::Util? |
| 09:04 |
|
clintongormley |
hoelzro: re data types in ES, take a look at: http://www.elasticsearch.org/g[…]reference/mapping |
| 09:04 |
|
dipsy |
[ elasticsearch - guide - Mapping ] |
| 09:04 |
|
clintongormley |
and especially http://www.elasticsearch.org/g[…]g/core-types.html |
| 09:04 |
|
dipsy |
[ elasticsearch - guide - Core Types ] |
| 09:05 |
|
clintongormley |
so you could definitely use enums, but they would just be stored as an array in the backend |
| 09:05 |
|
clintongormley |
s/backend/ES server/ |
| 09:08 |
|
clintongormley |
re memory usage - ES does use memory, and performs best when it has sufficient RAM |
| 09:08 |
|
clintongormley |
it is, however, doing an enormous job, and ES is a highly tuned bit of kit |
| 09:12 |
|
|
jsut_ joined #metacpan |
| 09:17 |
|
|
jsut left #metacpan |
| 11:40 |
|
confound |
clintongormley: it's not like his memory use complaint had anything to do with ES specifically, anyway. just java |
| 11:40 |
|
clintongormley |
confound: sure |
| 12:53 |
|
oalders |
mo, clintongormley: autarch mapped out some ideas for search rules here: https://github.com/CPAN-API/cpan-api/wiki/Ideas |
| 12:53 |
|
dipsy |
[ Ideas - GitHub ] |
| 13:57 |
|
punytan |
oalders: mo: yeah, introducing a versioned endpoint for the web api is a nice thing for backward compat |
| 14:35 |
|
autarch |
I think the question of module vs distro search is a good one |
| 14:36 |
|
autarch |
I think if someone searches for something that matches a single module, as long as that single module shows up first in the results, that's likely to be good enough |
| 14:36 |
|
autarch |
currently the code basically strips out :: in a search term, I'd say that it should actually leave that alone so that a search for Moose::Util just finds that module |
| 14:37 |
|
autarch |
or maybe it needs to be smarter, it can search both with and without ::, if it finds a match using ::, go with that, otherwise search just on the words without :: |
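The "try with ::, then fall back" idea can be sketched as follows. It assumes a plain in-memory list of module names standing in for the real index; the `search` function is hypothetical and only illustrates the order of the two lookups.

```python
def search(term: str, module_names: list) -> list:
    """Prefer an exact module match when the term contains '::'.

    module_names is a toy stand-in for the real search index.
    """
    if "::" in term:
        exact = [n for n in module_names if n.lower() == term.lower()]
        if exact:
            return exact
    # Fallback: treat the parts between '::' as independent words and keep
    # every module whose name contains all of them.
    words = [w.lower() for w in term.split("::") if w]
    return [n for n in module_names if all(w in n.lower() for w in words)]
```

With this order, a search for Moose::Util returns just that module, while a term with no exact match still falls back to a word search.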
| 15:43 |
|
|
hoelzro left #metacpan |
| 15:44 |
|
|
hoelzro joined #metacpan |
| 15:44 |
|
hoelzro |
I always thought it'd be nice if a search like Foo::Bar:: limited the results to modules under the Foo::Bar namespace |
| 15:45 |
|
hoelzro |
(not sure if Foo::Bar itself would be included) |
| 15:46 |
|
hoelzro |
just my 2¢ |
| 16:25 |
|
hoelzro |
instead of that, a prefix: or namespace: option would be nice |
| 19:34 |
|
Hinrik |
module:^Foo::Bar:: |
| 19:34 |
|
Hinrik |
or something |
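A namespace filter along these lines might look like the sketch below. The function name and the `include_root` flag are hypothetical; the flag exists only because the discussion leaves open whether Foo::Bar itself belongs in the results.

```python
def in_namespace(module_names: list, namespace: str, include_root: bool = False) -> list:
    """Keep only modules under the given namespace.

    Accepts 'Foo::Bar' or 'Foo::Bar::'; trailing colons are ignored.
    """
    root = namespace.rstrip(":")
    prefix = root + "::"
    return [
        n for n in module_names
        # Matching on 'Foo::Bar::' (with the separator) keeps unrelated
        # names such as 'Foo::Barista' out of the results.
        if n.startswith(prefix) or (include_root and n == root)
    ]
```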
| 21:03 |
|
hoelzro |
yeah, that'd be fine |
| 21:04 |
|
hoelzro |
I had another thought as well: what about P::M as an alias for Plack::Middleware? |
| 21:04 |
|
hoelzro |
or C::P for Catalyst::Plugin, D::Z::P for Dist::Zilla::Plugin, etc |
| 21:04 |
|
hoelzro |
there might be too many ambiguities, though... |
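To make the ambiguity concern concrete, here is a small sketch that expands an initials-style alias against a list of known namespaces. The `expand_alias` function and the sample namespace list are assumptions for illustration; a real implementation would need some rule for picking among multiple matches.

```python
def expand_alias(alias: str, namespaces: list) -> list:
    """Return every namespace whose parts start with the alias parts.

    'P::M' matches any two-part namespace shaped like P...::M...
    """
    initials = [p.lower() for p in alias.split("::")]
    matches = []
    for ns in namespaces:
        parts = ns.split("::")
        if len(parts) == len(initials) and all(
            part.lower().startswith(initial)
            for part, initial in zip(parts, initials)
        ):
            matches.append(ns)
    return matches
```

Given a list containing both Plack::Middleware and Pod::Man, the alias P::M expands to both, which is exactly the kind of ambiguity mentioned above.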
| 21:14 |
|
confound |
http://beta.metacpan.org/release/URI-Dispatch doesn't have any modules indexed |
| 21:14 |
|
dipsy |
[ URI-Dispatch-v1.1 - beta.metacpan.org ] |
| 22:11 |
|
|
clintongormley left #metacpan |