IRC log for #metacpan, 2016-12-06

All times shown according to UTC.

Time Nick Message
00:05 reyjrar mickey / ranguard: re: aliases and multiple indexes.. looking, but I'm not sure it's a true ES limitation; it might be imposed by Search::Elasticsearch, will confirm
00:53 reyjrar From my review of the docs for the API, _search should work against aliases.
00:53 reyjrar PUT/POST won't because you have to index documents to the index, not the alias.
00:54 reyjrar GET /alias/ may not work, but POST to /alias/type/_search should
00:57 reyjrar GET /<index_name>/ is a hook to the "get index information" call, which won't support aliases
00:57 reyjrar That call shouldn't be necessary
05:26 mickey reyjrar: thanks for looking into it. yes, from what we've seen, it seems like losing GET is breaking too many things.
06:02 melezhik joined #metacpan
07:31 pombreda joined #metacpan
07:45 ribasushi joined #metacpan
09:25 osfabibisi joined #metacpan
10:36 edward joined #metacpan
10:43 melezhik joined #metacpan
11:11 Relequestual joined #metacpan
11:35 nakiro joined #metacpan
11:35 neilb joined #metacpan
12:11 karjala With all the information from the 02packages file, I managed to allow users to search by module with wrong case letters.
12:12 karjala I have the complete list of modules.
12:12 karjala (I think)
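The case-insensitive lookup karjala describes could look roughly like the sketch below. It assumes the usual 02packages.details.txt layout (a header block, a blank line, then whitespace-separated `module version path` rows, as in the Carton::TreeNode row quoted later in this log). The function names `build_index` and `lookup` are hypothetical and are not taken from perlmodules.net.

```python
def build_index(text):
    """Map lowercased module names to the canonical spellings found in the file."""
    # The header block, if present, ends at the first blank line.
    header, sep, body = text.partition("\n\n")
    if not sep:
        body = text  # no header found: treat every line as a data row
    index = {}
    for line in body.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        module = parts[0]
        # Keep a list, since two names could differ only by case.
        index.setdefault(module.lower(), []).append(module)
    return index


def lookup(index, query):
    """Return every canonical module name matching the query, ignoring case."""
    return index.get(query.strip().lower(), [])
```

Returning a list rather than a single name leaves it to the caller to decide what to do if two indexed names collide once lowercased.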
13:48 Guest69 joined #metacpan
14:31 melezhik joined #metacpan
15:14 melezhik joined #metacpan
15:18 chansen joined #metacpan
15:47 osfabibisi joined #metacpan
18:59 metacpan joined #metacpan
18:59 metacpan [metacpan-api] mickeyn created mickey/mapping_script_time_slices (+1 new commit): https://git.io/v10I8
18:59 metacpan metacpan-api/mickey/mapping_script_time_slices a6c3c32 Mickey Nasriachi: default type copy to monthly slices (unless query is provided)
18:59 metacpan left #metacpan
19:03 metacpan joined #metacpan
19:03 metacpan [metacpan-api] mickeyn force-pushed mickey/mapping_script_time_slices from a6c3c32 to 9bde4c4: https://git.io/v10LZ
19:03 metacpan metacpan-api/mickey/mapping_script_time_slices 9bde4c4 Mickey Nasriachi: default type copy to monthly slices (unless query is provided)
19:03 metacpan left #metacpan
19:10 metacpan joined #metacpan
19:10 metacpan [metacpan-api] mickeyn force-pushed mickey/mapping_script_time_slices from 9bde4c4 to 781e4ec: https://git.io/v10LZ
19:10 metacpan metacpan-api/mickey/mapping_script_time_slices 781e4ec Mickey Nasriachi: default type copy to monthly slices (unless query is provided)
19:10 metacpan left #metacpan
19:12 uree joined #metacpan
19:40 metacpan joined #metacpan
19:40 metacpan [metacpan-web] haarg created haarg/pcc-pvc (+1 new commit): https://git.io/v10Gh
19:40 metacpan metacpan-web/haarg/pcc-pvc 5ffcb56 Graham Knop: switch to Params::ValidationCompiler
19:40 metacpan left #metacpan
19:46 neilb joined #metacpan
20:00 tmetro1 Can the metacpan API be used to answer a question like: "which CPAN module was uploaded for the first time in 2016?" and can it return some metric that acts as a proxy for popularity? (I'd like to say downloads, but as downloads aren't generally served by metacpan, it wouldn't be able to track that.)
20:06 Grinnz some downloads are. but yeah, each mirror would only know its own stats
20:06 Grinnz cpanm has a slightly more complete picture, but only from downloads using cpanm of course
20:07 Grinnz not sure if that data is accessible anywhere
20:08 trs tmetro1: you can probably find the ES query syntax to find the minimum release date by distribution
20:08 trs and then check that against 2016
20:08 trs it's probably not straightforward though.
20:08 tmetro1 One of the proxies another developer here suggested was the list of other modules that are dependent on it. I'm pretty sure that's available in the meta data. Don't know if it can be queried for with other criteria.
20:09 trs tmetro1: you want to look at the reverse dependencies endpoint
20:09 mst revdeps are totally a thing, and the cpan river reports are based on them
20:10 trs ah, actually, river data is in the v1 api now
20:10 trs so you could use that directly, perhaps
20:10 trs once you find your dists which were first released upon the world in 2016
20:11 tmetro1 > find the minimum release date by distribution
20:11 tmetro1 So sounding like we might need to do query A, then iterate over that result set with query B to refine it, rather than getting what we want in one shot. That should be doable.
20:12 mst one of these years I'll get around to doing a relational store for metacpan
20:12 trs tmetro1: correct.  most complex things in ES you'll need to use multiple queries for, due to lack of joins.
20:13 trs mst: we'd all like that :)
20:13 mst I hope "one of these years" indicates how deep it is on the yak stack for me
20:13 mst (though if somebody else looked at it I'd happily help)
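The "query A, then query B" approach discussed above can be sketched client-side. This is a minimal illustration only: it assumes release records have already been fetched as dicts with `distribution` and `date` keys (the field names and the ISO-style date format are assumptions), and that reverse-dependency counts come from a separate lookup. All function names are hypothetical.

```python
from datetime import datetime


def first_release_dates(releases):
    """Step A: find the earliest release date seen for each distribution."""
    first = {}
    for rel in releases:
        released = datetime.strptime(rel["date"], "%Y-%m-%dT%H:%M:%S")
        dist = rel["distribution"]
        if dist not in first or released < first[dist]:
            first[dist] = released
    return first


def dists_first_released_in(releases, year):
    """Keep only distributions whose very first release falls in the given year."""
    first = first_release_dates(releases)
    return sorted(dist for dist, when in first.items() if when.year == year)


def rank_by_reverse_deps(dists, revdep_counts):
    """Step B: order the candidates by reverse-dependency count, highest first."""
    return sorted(dists, key=lambda d: (-revdep_counts.get(d, 0), d))
```

The join happens in the client because, as noted in the discussion, the store itself does not offer joins.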
20:13 tmetro1 As we thought about it more, we realized we wouldn't be able to use a window as recent as 2016, as it takes time for a module to find an audience. One released in January is going to obviously have far better numbers than one released in September. So we'll have to look at a window further back, and give each module the same 12 months (or whatever). But that then means looking at the dependency graph at specific points in time, which is probably not doable f
20:15 trs you got cut off due to IRC message length limits.
20:15 trs at "not doable"
20:15 trs indeed, the rev dep graph at points in time is likely not doable.
20:16 tmetro1 ...not doable from the API.
20:16 tmetro1 (Ah, yes, old Pidgin bug. Doesn't warn when limits are hit.)
20:18 mst what you *could* do, however, is to check revdeps now, then walk backwards along those revdeps to see if they were revdeps at the 12 month mark
20:18 mst which I suspect would get you close enough
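mst's walk-backwards idea might be sketched as follows. It assumes that, for each distribution that is a reverse dependency today, a release history is already available as a list of `(release_date, prereqs)` pairs; that data shape and the function names are hypothetical, not something the API is known to return in this form.

```python
from datetime import date


def depended_at(history, target, cutoff):
    """True if the latest release on or before `cutoff` listed `target` as a prereq."""
    earlier = [(when, deps) for when, deps in history if when <= cutoff]
    if not earlier:
        return False  # the dependent did not exist yet at the cutoff
    _, deps = max(earlier, key=lambda item: item[0])
    return target in deps


def revdeps_at(current_revdeps, histories, target, cutoff):
    """Filter today's reverse dependencies down to those already present at `cutoff`."""
    return sorted(
        name
        for name in current_revdeps
        if depended_at(histories.get(name, []), target, cutoff)
    )
```

Because it starts from today's reverse dependencies, this misses distributions that depended on the target at the cutoff and have since dropped it, which is why the result is only "close enough".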
20:18 tmetro1 It may be simpler to just alter the rules for what we are looking for. The original thought was just to use some objective criteria (so we didn't have to have a judging panel) to find a "best of" module for 2016 for the fun of it, and give the author some accolades from Boston.pm.
20:19 tmetro1 A perhaps more useful question would be to surface a top-5 list of modules that are rising in relevance, to draw developers' attention to them so they can add them to their portfolio.
20:20 tmetro1 Then we'd use some formula that combines quantity of dependent modules and age of the module. That would avoid the time window problem.
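No concrete formula is given in the discussion. One possible reading, purely as an illustration, is "dependent modules per month of age". The sketch below uses that assumed formula; the input shape (`name`, `revdeps`, `first_release`) and the function names are hypothetical.

```python
from datetime import date


def age_in_months(first_release, today):
    """Whole months since the first release, never less than one."""
    months = (today.year - first_release.year) * 12 + (today.month - first_release.month)
    return max(months, 1)


def rising_score(revdep_count, first_release, today):
    """Assumed formula: dependents gained per month of the module's life."""
    return revdep_count / age_in_months(first_release, today)


def top_rising(modules, today, n=5):
    """Return the names of the n highest-scoring modules, ties broken by name."""
    scored = [
        (rising_score(m["revdeps"], m["first_release"], today), m["name"])
        for m in modules
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:n]]
```

Dividing by age means an older module needs proportionally more dependents to rank alongside a newer one, which is the time-window concern raised above. Any real weighting would need tuning against actual data.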
20:26 Grinnz tmetro1: keep in mind the reverse dependency (and forward dependency, for that matter) data is based only on statically declared dependencies; a distribution can choose to add or remove non-configure prereqs however it wants upon installation
20:26 Grinnz which may or may not be specific to the system being installed to
20:27 trs but also keep in mind that nothing's perfect and you work with the data you have, not the data you wish you had. :)
20:27 trs (presuming you can't magically make the data you wish you had appear)
20:29 tmetro1 > choose to add or remove non-configure prereqs
20:29 tmetro1 So you're saying a prereq that is optional might not be listed in the metadata for module A, but might be for module B? I guess if most authors adhere to the strict definition of prereqs and list only hard dependencies, that'll be fine.
20:30 tmetro1 Is there a "suggests" field, as there is with Debian packages? (That would be where the optional stuff should get listed.)
20:30 mst there is, but that's not what we're talking about
20:30 Grinnz tmetro1: i'm saying it might not be listed in metadata at all, and get added to the dependencies at install time on machine A and not on machine B
20:30 mst for example, Moo needs MRO::Compat on perl 5.8
20:31 Grinnz or it might be in the metadata, and removed on machine B
20:31 mst so on 5.8 MYMETA will have MRO::Compat but it's not in META at all
20:31 Grinnz essentially, META.json is only authoritative for the configure phase, by necessity since those dependencies are needed to figure out the rest
20:32 Grinnz but for your purposes, there's not really anything you can do about it; just making sure you know about it
20:35 tmetro1 OK, sure, dynamically determined dependencies that are impacted by the specific environment.
20:35 tmetro1 Yes, good to be aware of that. Not sure it'll have a big impact on the type of info we're trying to surface. It might give a boost to "glue" modules used to make older environments compatible with newer modules. If such modules can be identified by human review, they could be manually dropped off the list.
20:35 oalders tmetro1: the ++ data is available as well, if you want to throw that into the mix
20:35 tmetro1 And what is that?
20:35 tmetro1 Oh, ratings, sure.
20:36 oalders right, but not the CPANRatings, although they are also available
20:36 tmetro1 I guess that wasn't my first thought, as in my observation most modules have a statistically insignificant number of ratings. Hard to tell what represents a trend.
20:37 oalders https://metacpan.org/favorite/recent vs https://metacpan.org/favorite/leaderboard
20:38 tmetro1 So first is just a chronological list of modules most recently ++'ed, and the latter ranks them by count. Yeah, we'll take a look at that.
20:39 tmetro1 Thanks Grinnz, trs, mst and oalders for your suggestions. We'll have a poke at the API and see if we can arrive at something useful.
20:41 mst I did at one point try to get stats from cpan mirrors but it became kind of obvious that there are so many of them and so many people use local mirrors and etc. that the data wasn't going to be particularly useful
20:43 ranguard we could probably do something reasonably easy with the logs from cpan.metacpan.org - I can dump them into S3 anonymised if someone did want to take that on
20:44 mst as I say, I mostly concluded I wouldn't get anything approximating a representative sample
20:45 ranguard mst: what do you mean, everyone should use cpan.mc.org ;)
20:46 mst ranguard: http://trout.me.uk/seriously.jpg
20:46 ranguard but fair point, would have to be just 1 small factor in any sort of metrics
20:48 tmetro1 Doesn't Debian address this by having a "popularity contest" feature that users opt in to, and then they treat it as a statistical sampling? That'll obviously have a bias, but was probably the best compromise that respected privacy.
21:04 neilb joined #metacpan
21:13 karjala Would anyone find useful a feature on perlmodules.net that would track changes to dependencies of your CPAN modules? So, you would only type the name of your module, and you would get a feed of changes of its deps. Or you would type your PAUSE ID, and would get a feed tracking the deps of all your modules.
21:13 karjala (looking for something to do)
22:45 ether this search turns up MIYAGAWA/Carton-v1.0.28 and MIYAGAWA/carton-v0.9.15 -- https://metacpan.org/search?q=carton&search_type=modules
22:45 ether the second one should be dropped
22:57 oalders This is the problem: Carton::TreeNode                  undef  M/MI/MIYAGAWA/carton-v0.9.15.tar.gz