
IRC log for #salt, 2018-03-26


All times shown according to UTC.

Time Nick Message
00:14 armin_ joined #salt
00:18 aphor @jsmith0012 what do you mean by "can you match a grain in a file.managed?"
00:24 armyriad joined #salt
00:37 masber joined #salt
00:39 zerocoolback joined #salt
00:44 keldwud joined #salt
00:49 exarkun joined #salt
00:50 armyriad joined #salt
01:38 jrklein joined #salt
01:57 ilbot3 joined #salt
01:57 Topic for #salt is now Welcome to #salt! <+> Latest Versions: 2016.11.9, 2017.7.4 <+> RC for 2018.3.0 is out, please test it! <+> Support: https://www.saltstack.com/support/ <+> Logs: http://irclog.perlgeek.de/salt/ <+> Paste: https://gist.github.com/ <+> See also: #salt-devel, #salt-offtopic, and https://saltstackcommunity.herokuapp.com (for slack) <+> We are volunteers and may not have immediate answers
02:08 shiranaihito joined #salt
02:11 cgiroua joined #salt
02:11 rlefort joined #salt
02:19 robawt joined #salt
02:33 zerocoolback joined #salt
02:36 keldwud joined #salt
02:46 evle joined #salt
02:49 JPT joined #salt
02:52 zerocoolback joined #salt
02:53 zerocoolback joined #salt
02:53 zerocoolback joined #salt
02:54 zerocoolback joined #salt
02:55 zerocoolback joined #salt
03:17 masber joined #salt
03:31 v0rtex joined #salt
03:33 asoc joined #salt
04:14 motherfsck joined #salt
04:19 indistylo joined #salt
04:49 masber joined #salt
04:54 keldwud joined #salt
05:29 Mattch joined #salt
05:33 cewood joined #salt
05:34 Guest73 joined #salt
06:12 tyx joined #salt
06:13 aruns joined #salt
06:16 Ricardo1000 joined #salt
06:27 OliverUK joined #salt
06:30 cyril-dunzo joined #salt
06:40 aruns__ joined #salt
07:06 Danny joined #salt
07:08 rgrundstrom joined #salt
07:14 aldevar joined #salt
07:16 Danny159 joined #salt
07:18 Hybrid joined #salt
07:19 aviau joined #salt
07:25 Danny159 Hi, last Friday I tried to post a message to the google group salt-users, but it never became visible. I did get a notification that it was posted successfully, though it still had to be approved. However it remained silent after that. Does anyone know what could be wrong?
07:35 darioleidi joined #salt
07:42 aruns joined #salt
07:47 marcus123 joined #salt
07:54 cewood joined #salt
07:56 rahavjv joined #salt
08:09 masber joined #salt
08:14 bdrung_work joined #salt
08:14 darioleidi joined #salt
08:16 Tucky joined #salt
08:36 rahavjv left #salt
08:36 rahavjv joined #salt
08:38 rollniak joined #salt
08:40 rahav joined #salt
08:40 rahav hi
08:40 rahav i have switched to TCP transport
08:41 rahav i have a 1000 minions connected to my salt master
08:41 rahav there are 1000 persistent connections to the master on 4505
08:41 rahav some of the minions are not reachable
08:42 rahav looking into the event bus, i see the job fired, followed by a job fired to check the status after 5 seconds. I tried setting a timeout of 30s, but that didn't help.
08:55 inad922 joined #salt
08:57 masber joined #salt
09:03 masber joined #salt
09:07 aruns__ joined #salt
09:18 Naresh joined #salt
09:29 jas02 joined #salt
09:30 jas02 joined #salt
09:43 egilh joined #salt
09:53 inad922 joined #salt
09:54 Red_Devil joined #salt
09:55 Red_Devil Hi, i'm trying to use the csf module. But i keep getting the message: "Module 'csf' is not available."
09:56 Red_Devil I'm using salt 2017.7.4 (Nitrogen) on Centos 7, from the salt repository
10:05 pf_moore joined #salt
10:08 Udkkna joined #salt
10:23 KeplerOrange joined #salt
10:24 inad922 joined #salt
10:34 Red_Devil found it, the server had an outdated salt-minion
10:35 Guest73 joined #salt
10:43 aruns__ joined #salt
10:44 aruns joined #salt
10:46 aruns joined #salt
10:48 Udkkna joined #salt
10:51 BarBQ joined #salt
11:23 evle1 joined #salt
11:47 onslack <msmith> MTecknology: good point, however i believe that pillar cidr checks are done based on the ip of the minion's connection. worth confirming tho
11:49 sol7 joined #salt
11:57 nebuchadnezzar hello
11:57 jas02 joined #salt
12:03 nebuchadnezzar I have a state to mount an NFS4 share but I need to wait for the network to be configured, is there a way to make my systemd.networkd state return True only when DHCP has bound the IP address to the interface? Otherwise the NFS state is executed too early.
12:07 edrocks joined #salt
12:12 Nahual joined #salt
12:14 jas02 joined #salt
12:16 nebuchadnezzar In the meantime I configured a retry
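
A minimal sketch of that retry workaround, assuming a hypothetical NFS4 export and mount point (names are illustrative, not from nebuchadnezzar's setup); the state-level retry option re-runs the mount a few times while DHCP finishes:

    /mnt/data:
      mount.mounted:
        - device: nfs.example.com:/srv/data   # hypothetical export
        - fstype: nfs4
        - mkmnt: True
        - retry:
            attempts: 5
            interval: 15
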
12:17 zerocoolback joined #salt
12:26 darioleidi joined #salt
12:27 darioleidi joined #salt
12:31 xet7 joined #salt
12:33 aruns__ joined #salt
12:38 ExtraCrispy joined #salt
12:42 cgiroua joined #salt
12:44 pcn Are you sure you want systemd to signal true/false?  Maybe https://docs.saltstack.com/en/latest/ref/beacons/all/salt.beacons.network_settings.html#module-salt.beacons.network_settings would work?
12:45 bdrung_work joined #salt
12:50 zerocoolback joined #salt
12:54 edrocks joined #salt
12:57 inad922 joined #salt
12:59 Hybrid1 joined #salt
13:01 aruns joined #salt
13:05 hoonetorg joined #salt
13:14 Danny159 Does anyone know if the master/minion connection can be forced to reconnect after a network.managed state has changed the IP address of the minion?
13:19 onslack <msmith> i think the only answer i've seen is "it depends". the minion is supposed to retry the connection automatically, however it's possible that the minion then fails and stops rather than continuing to retry. the minion logs should reveal what's happening
13:24 Danny159 I can see that whenever a subsequent state tries to retrieve files from the master, the minion does retry, but fails with a SaltReqTimeoutError (after 3 retries). At a network level communication is already possible almost immediately after the network.managed state finished.
13:25 eseyman joined #salt
13:25 jas02 joined #salt
13:25 Danny159 After the SaltReqTimeoutError (which takes about 4 minutes), the current salt state execution is skipped and any states after that are executed perfectly fine again.
13:28 Danny159 It looks like the minion (or master) is a little confused by the situation and it takes some time to recover. It's fine that it needs some time, but I'd like to prevent the very first state that runs after network.managed from failing because of it.
13:28 onslack <msmith> might be worth looking through open issues in case it's a known problem, or open a new one if you can't find one
13:28 Danny159 Thanks, I'll do that.
13:29 rlefort joined #salt
13:37 _JZ_ joined #salt
13:38 shpoont joined #salt
13:42 jas02 joined #salt
13:43 straya joined #salt
13:43 straya left #salt
13:45 masuberu joined #salt
14:05 jas02 joined #salt
14:05 zerocoolback joined #salt
14:07 nixjdm joined #salt
14:12 KyleG joined #salt
14:12 KyleG joined #salt
14:12 masuberu joined #salt
14:13 racooper joined #salt
14:15 shpoont joined #salt
14:21 zerocoolback joined #salt
14:23 robawt joined #salt
14:27 cgiroua joined #salt
14:28 alvinstarr joined #salt
14:35 DammitJim joined #salt
14:40 indistylo joined #salt
14:45 zerocoolback joined #salt
14:50 tiwula joined #salt
15:02 KyleG joined #salt
15:02 KyleG joined #salt
15:05 jxs1_ joined #salt
15:07 Hybrid joined #salt
15:10 jas02 joined #salt
15:12 zerocoolback joined #salt
15:29 lordcirth_work Is there a way to override pillar_merge_lists in one pillar file?
15:30 ckonstanski joined #salt
15:35 zerocoolback joined #salt
15:39 beardedeagle joined #salt
15:41 DanyC joined #salt
15:44 babilen lordcirth_work: Not aware of it. Maybe you can achieve the same outcome in a different way?
15:47 lordcirth_work Well yeah, the program takes comma-separated strings anyway
15:48 onslack <msmith> perhaps you could do the merge yourself in jinja and output the resulting dict into pillar?
15:48 onslack <msmith> or use the python renderer instead of yaml
15:48 lordcirth_work It's not important enough to bother in this case, tbh.  It's already done.  Just wondering for future reference.
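
A rough sketch of msmith's "merge it yourself in jinja" suggestion in a pillar file (the pillar key and list contents are invented for illustration); the lists are combined before the YAML is rendered, so pillar_merge_lists never comes into play:

    {% set common_opts = ['ssl', 'compress'] %}
    {% set extra_opts = ['debug'] if grains['os'] == 'Ubuntu' else [] %}
    myapp:
      options:
    {%- for opt in common_opts + extra_opts %}
        - {{ opt }}
    {%- endfor %}
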
15:49 exarkun is salt-ssh only meant to be run on the same host as is running the salt master?
15:51 onslack <msmith> probably
15:52 exarkun bummer
15:53 lordcirth_work I have heard of devs using salt-ssh to test local git branches from their machines before pushing
15:53 babilen exarkun: Could you elaborate on that? It's perfectly fine to run salt-ssh on boxes that aren't a "salt-master"
15:54 babilen If that's advisable or not is a different discussion, but you can definitely run it locally on your laptop/workstation
15:54 onslack <msmith> interesting, i would have expected it to need master config, states, pillar, etc, in order to be able to function
15:54 babilen Sure, but you can have those locally on your "laptop"
15:55 dezertol joined #salt
15:55 babilen Hence the --verbose
15:55 onslack <msmith> so not a master, but has everything else a master has? :)
15:55 babilen Pretty much
15:55 exarkun [DEBUG   ] Missing configuration file: /etc/salt/master
15:55 exarkun No permissions to access "/var/log/salt/ssh", are you running as the correct user?
15:55 babilen Really depends on what exarkun wants to do
15:56 babilen exarkun: Ah, yeah .. I typically use Saltfile to work around those
15:56 tzero joined #salt
15:56 exarkun I am learning about salt-ssh to try to divine whether it is suitable for bootstrapping some nodes to use "regular" salt.
15:56 onslack <msmith> true, it does have config of its own
15:56 onslack <msmith> that's what bootstrap is for...
15:56 exarkun If I have to set up a bunch of stuff to make salt-ssh work then I guess it's not much help, since I could just spend that time setting up "regular" salt.
15:57 exarkun I assume "bootstrap" refers to https://docs.saltstack.com/en/latest/topics/tutorials/salt_bootstrap.html
15:57 exarkun And that's great but it only gets me salt, not my state and pillar files.
15:57 onslack <msmith> the chaos that is bootstrap.sh, yes
15:57 lordcirth_work Well, if you already have a state that, eg, adds the salt repos, changes the minion config file and so on, then applying it via salt-ssh should just work, right?
15:57 onslack <msmith> it'll set up a minion. the state and pillar usually come from a master
15:58 babilen http://paste.debian.net/1016767/ something like this
15:58 exarkun lordcirth_work: I don't have such a thing
15:58 lordcirth_work Setting up a salt-master should consist of adding the repo, installing the package, then git clone your states and pillar
15:58 babilen Easy to keep in a single repo/directory
15:58 exarkun lordcirth_work: So... 3 steps.  I was looking at salt-ssh to learn whether it would make that <3 steps.
15:59 babilen And yeah, you can use salt-ssh to bootstrap "proper" salt if that is what you want
15:59 exarkun 4 steps really, because master.conf
15:59 onslack <msmith> are you talking about setting up a new master from nothing, or do you have something you can already leverage?
15:59 lordcirth_work well yeah, have to copy master.conf out of states
15:59 babilen Assuming your instances run SSH and you have deployed suitable authentication
16:00 exarkun babilen: 5 steps!
16:00 exarkun I'm talking about a new master from nothing
16:00 babilen The above paste exemplifies keeping the salt-ssh config in a single location and does not require you to use a global one
16:00 babilen yes, you can use salt-ssh to bootstrap a new master and its minions
16:00 onslack <msmith> so bootstrap a minion, use salt masterless to promote to master, and all your config and states can be in git
16:01 exarkun Okay, I see the Saltfile, I didn't know about that.
16:01 babilen Just ensure you can use SSH and then fire the salt-formula states with suitable pillar data
16:01 exarkun you guys are going way too fast for me
16:01 babilen Saltfile allows you to specify command line options
16:02 exarkun It'll take me a week to digest all these options.
16:02 babilen Better start nomming then
16:03 onslack <msmith> i'm pretty sure i've seen someone else doing something very similar, although i don't remember who or the details
16:03 babilen I am in some setups
16:05 babilen I use https://github.com/saltstack-formulas/salt-formula (in /srv/salt/formulas/salt-formula referenced in config/master) and the pillar settings that are absolutely necessary for the master (and minions) to bootstrap and connect. The master can then "highstate itself" and "highstate minions"
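
The paste above is not preserved in this log, but a per-project salt-ssh setup of the kind babilen describes looks roughly like this (paths, host and options are illustrative; Saltfile keys mirror salt-ssh command line options):

    # ./Saltfile
    salt-ssh:
      config_dir: ./etc/salt
      roster_file: ./roster

    # ./roster
    newmaster:
      host: 203.0.113.10
      user: ubuntu
      sudo: True
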
16:05 aldevar left #salt
16:14 om2 joined #salt
16:15 onlyanegg joined #salt
16:24 indistylo joined #salt
16:31 masuberu joined #salt
16:36 Edgan babilen: I just use salt-ssh to bootstrap salt masters with the same code I use to maintain them normally.
16:38 robawt joined #salt
16:41 babilen Edgan: I use a "minimal" configuration for the bootstrapping that's applicable to all masters and then let them highstate themselves using project specific pillars
16:42 BitBandit joined #salt
16:44 Trauma joined #salt
16:50 zerocoolback joined #salt
16:57 indistylo joined #salt
17:10 lordcirth_work is there a clean way to make 1 state run after another *if* it's also being applied, but not pull it in as a dep?
17:11 zerocoolback joined #salt
17:20 onslack <msmith> yes, state files are run in order, top to bottom, if no requisites change that order
17:21 inad922 joined #salt
17:24 zer0def "applied" means "always applied" or "changed"?
17:25 Sketch joined #salt
17:31 jas02 joined #salt
17:34 jas02 joined #salt
17:44 jas02 joined #salt
17:45 lordcirth_work They are in different files.  In this case, apt-cacher-ng.server and apt-cacher-ng.client.  If both are applied and client points to localhost, server must go first, otherwise client will break apt before it can install the server
17:46 lordcirth_work But the client shouldn't depend on the server because it could be pointing somewhere else
17:48 zer0def oh, that way, yeah, states are executed sequentially in the order defined by default
17:50 lordcirth_work So if I change the order in top.sls, it will work?
17:50 edrocks joined #salt
17:56 zer0def well, apt-cacher-ng.client would run first, because it's the first specified
17:56 jas02 joined #salt
17:57 zer0def i'd simply opt to have a spelled out requisite - communicates and enforces intent a lot better
17:57 jas02 joined #salt
18:00 onslack <msmith> or extend in some hitherto amazingly clever way :)
18:00 onslack <msmith> but otherwise changing the order in top should be all it needs
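
An illustrative top.sls for the case being discussed, with the server SLS listed before the client so it is applied first on a box that gets both (the target pattern is made up):

    base:
      'aptcache*':
        - apt-cacher-ng.server   # applied first so apt keeps working
        - apt-cacher-ng.client
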
18:10 rlefort joined #salt
18:19 ymasson joined #salt
18:24 jpsharp If I have a "watch" defined in a salt state file, do I have to run the state-apply for the watch to actually run, or is the minion constantly watching for that change?
18:26 zer0def `watch` is basically `require` and `onchanges` combined, with `onchanges` triggering reload behavior
18:30 schemanic joined #salt
18:30 jpsharp Okay, so the changes won't happen unless I run a state apply.
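
For reference, the usual shape of a watch requisite, which is only evaluated during a state run (the service and file are generic examples, not jpsharp's setup):

    nginx:
      service.running:
        - enable: True
        - watch:
          - file: /etc/nginx/nginx.conf

    /etc/nginx/nginx.conf:
      file.managed:
        - source: salt://nginx/files/nginx.conf
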
18:31 schemanic hello, during my initial highstates my minions are timing out, where can I extend the timeout value for that?
18:32 jpsharp ah, I need to use a beacon.
18:33 jpsharp I want to be able to trigger a git update on a bunch of remote servers when a repository is updated.
18:39 ddg_ joined #salt
18:42 lordcirth_work zer0def, the point is that apt-cacher-ng.client might instead depend on an apt-cacher-ng.server on another machine. Anyway it's moot because squid-deb-proxy is better.
18:44 zer0def in that case the requisite just moves from a state SLS to an orchestration SLS
18:53 aldevar joined #salt
18:56 racooper joined #salt
19:05 spiette joined #salt
19:08 MTecknology In a salt module, would this be a bad thing to do?  if not minion_id: minion_id = __grains__.get('id', '')
19:13 lordcirth_work MTecknology, where would you be getting minion_id from beforehand?
19:13 MTecknology something about it feels kinda wrong to me, but I also can't come up with anything better
19:14 jas02 joined #salt
19:14 MTecknology lordcirth_work: It's called using "{% set node = salt.st_util.parse_id() %}" but I wanted to support sending an alternate id (mostly for testing).
19:15 MTecknology s/but /. /
19:15 tyx joined #salt
19:22 shpoont joined #salt
19:25 Hybrid joined #salt
19:44 Hybrid joined #salt
19:46 lane_ joined #salt
19:48 jas02 joined #salt
19:49 aldevar left #salt
19:53 Guest73 joined #salt
19:58 jas02 joined #salt
20:02 robawt joined #salt
20:02 onlyanegg joined #salt
20:05 Laogeodritt joined #salt
20:06 jas02 joined #salt
20:13 jas02 joined #salt
20:15 schemanic joined #salt
20:20 cro joined #salt
20:21 robawt joined #salt
20:23 jas02 joined #salt
20:29 lordcirth_work So, we're a small IT department, and maybe ~12 people need to use Salt.  We've got a ton of existing machines to convert.  What workflow should we use?
20:32 jas02 joined #salt
20:32 MTecknology I'm in about that position, except <1/4 of the people. The plan is- roll new salt masters, and migrate everything one-by-one as we do a major os upgrade.
20:35 femnad joined #salt
20:41 exarkun joined #salt
20:41 lordcirth_work I am also hoping to have all 18.04 systems in Salt by manager fiat. But I need a workflow planned out, documented, and trained by the 18.04 launch.
20:43 jas02 joined #salt
20:50 Eugene My 2c: it is a little bit optimistic to write training materials for unlaunched products
20:51 Eugene At $DAYJOB, our rule is that you never use the .0 release of any software. Consequently, we are waiting until 18.04.1 before rolling out support
20:51 lordcirth_work The training is for salt, not 18.04.  18.04 just happens to be the deadline.
20:52 lordcirth_work Our Linux admins already know how to adapt stuff to 18.04, that's a standard Linuxy problem.
20:53 lordcirth_work But when 18.04 hits, they'll start installing 18.04 machines, and if they don't do them in Salt from the beginning, they'll never get migrated in reasonable time
20:53 MTecknology I'm already deploying 18.04 boxes
20:57 lordcirth_work Yes, there are already some dev boxes happening. That's why I need to hurry
20:58 jas02 joined #salt
21:01 lordcirth_work MTecknology, so what workflow do you use to make changes in Salt?
21:02 MTecknology at home, in my picture-perfect environment, I use vim path/foo; git commit; git push, and then wait.
21:03 MTecknology for $client, the future workflow will be pretty similar, but s/wait/salt '*' state.highstate/
21:07 lordcirth_work MTecknology, so, you have a git repo cloned on your workstation, you push, salt master automatically pulls and runs?
21:07 MTecknology yup
21:08 MTecknology The repo has a git hook that runs a wrapper script that sanitizes a salt-event command that the master reacts to by starting an orchestra.
21:09 MTecknology You could also use salt-api.
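
Roughly what the master side of that hook-driven flow can look like; the event tag, reactor path and orchestration SLS below are invented for this sketch, not MTecknology's actual setup:

    # /etc/salt/master.d/reactor.conf
    reactor:
      - 'deploy/states/updated':          # tag the git hook fires, e.g. via salt-call event.send
        - /srv/reactor/update_states.sls

    # /srv/reactor/update_states.sls
    run_deploy_orchestration:
      runner.state.orchestrate:
        - args:
          - mods: orch.deploy
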
21:10 onlyanegg joined #salt
21:11 realrivyn joined #salt
21:11 realrivyn Is there something like cmd.run's onlyif that can be used to control an entire state?  I have a state that creates a directory and a bunch of stuff in it.  I don't want it to do anything if that directory already exists.
21:12 whytewolf onlyif is not cmd.run only
21:12 rivyn ok, well that's the only place I've used it so far, sorry.
21:12 rivyn the state has 13 different steps - would I need to add the same onlyif to every one then?
21:13 MTecknology it sounds like you mean "an sls file" and not "a state"
21:13 whytewolf ^
21:13 rivyn yes, I thought those were synonymous
21:13 MTecknology no
21:13 rivyn ok, then I mean for the whole sls file
21:13 whytewolf https://docs.saltstack.com/en/latest/ref/states/requisites.html <-- this is the document you want to read, as it has all the requisites to control how things change based on other states
21:13 rivyn whatever that is
21:14 rivyn I have read that extensively
21:14 rivyn it's confusing and I have difficulty getting anything to work as I'd expect.
21:14 rivyn I had a requisite from that page (onchanges) and the sls would work every other time, freeze halfway through every other time.
21:15 rivyn I could post the sls and minion log if you want
21:15 MTecknology that would definitely be helpful
21:15 zer0def you *could* abuse yaml anchors, if you really needed the same `onlyif` in every state in the file… my guess is that you're probably over/under-doing something and without an example we're unable to tell more
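
What "abusing yaml anchors" might look like, assuming a hypothetical guard condition: anchor the test once, then alias it in every state that needs it:

    first_step:
      cmd.run:
        - name: do-something
        - onlyif: &cluster_missing test ! -d /var/lib/postgresql/10/test
    second_step:
      cmd.run:
        - name: do-something-else
        - onlyif: *cluster_missing
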
21:15 rivyn alright, just wasn't sure anybody was willing to take the time to look at it.
21:15 MTecknology (just don't use pastebin.com, never ever use pastebin.com)
21:16 rivyn let me go paste some stuff
21:16 MTecknology if you don't ask a complete question, then no... most people won't bother
21:16 MTecknology this channel is kinda special, though
21:17 zer0def ghostbin it, at least you can provide a timeout period on your paste (for whatever that's worth)
21:17 MTecknology I tend to use dpaste.com for most things
21:18 rivyn here's the sls without the onchanges that wasn't working (I had onchanges: cluster_initialized in the cmd.run initially)
21:18 zer0def there's no link… yet
21:18 rivyn the problem with it as it is, is that if I run it twice, the second run blows away what the first did and redoes it
21:18 rivyn oops:  http://dpaste.com/1690ECD
21:19 MTecknology omg..
21:19 * MTecknology runs away
21:19 onlyanegg joined #salt
21:19 cgiroua_ joined #salt
21:19 rivyn why?
21:20 rivyn (and there's confirmation of my suspicion of no willingness to help :( )
21:20 zer0def rivyn: for starters, all those _directory states can be wrapped into one, using `names` instead of `name` and providing a list of those jinja vars
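
What zer0def means by consolidating with `names`, sketched with illustrative paths standing in for the jinja variables in the paste:

    postgresql_directories:
      file.directory:
        - names:
          - /var/lib/postgresql/10/test
          - /srv/backup/basebackup
          - /srv/backup/wal_archive
        - user: postgres
        - group: postgres
        - makedirs: True
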
21:20 MTecknology that's a *LOT* of jinja
21:20 rivyn really??
21:20 MTecknology it makes me feel dirty
21:20 cgiroua__ joined #salt
21:20 zer0def that's pretty tame, MTecknology
21:20 whytewolf actually the jinja isn't too bad. it is all set in the top to just use {{}} in the body
21:21 rivyn well then please explain what it is that I'm supposed to do instead to accomplish the same result
21:21 rivyn please.
21:21 zer0def hold on, i'm still going through the file, that's just the first thing that jumped out at me
21:21 * MTecknology is also still reading
21:22 zer0def so this is basically a pgsql slave sls
21:22 zer0def or some crap
21:23 zer0def rivyn: what's `postgresql_basebackup` supposed to achieve?
21:23 MTecknology http://dpaste.com/1690ECD#line-128
21:23 rivyn it makes a filesystem copy of the data directory
21:23 zer0def ok, what for?
21:24 rivyn for disaster recovery / recovery in general
21:24 zer0def couldn't you just have a streaming slave replica then?
21:24 MTecknology wouldn't a file-copy most likely produce unrecoverable backups?
21:24 rivyn postgresql write-ahead log files are copied to a network drive as they are created.  But you also need a base backup as the starting point to use them for recovery
21:24 rivyn no, it's pg_basebackup.
21:25 rivyn it tells the database it's starting first, then tells it when it's done
21:25 rivyn so that the filesystem copy is safe
21:25 rivyn hence why it's an included command called pg_basebackup
21:25 zer0def you don't have to explain postgresql to me, i've done this way too many times than i can count
21:25 rivyn then why did you ask??
21:26 zer0def because i don't understand what's the intended purpose of this
21:26 rivyn of the sls?
21:26 rivyn it's to initialize a postgresql database cluster.
21:26 zer0def of the postgresql_basebackup state, specifically
21:26 rivyn it's named postgresql/initialize_cluster.sls
21:26 zer0def everything else is cake.
21:26 rivyn it creates the initial backup - the configuration is already set up to archive the wal files too
21:27 zer0def so you're required to have a wal archive?
21:27 rivyn it's not very relevant - you could remove that entirely from the SLS and it wouldn't affect my question.
21:27 zer0def that's sort-of true, yeah
21:27 MTecknology so this will re-initialize every time it's run, eh?
21:27 zer0def although i'd definitely advise against doing this, since every time this SLS is run, it'll clear out {{ backup_directory + '/basebackup' }} and create a new one
21:27 rivyn yes, that's what I don't want
21:28 MTecknology then fix this-> http://dpaste.com/1690ECD#line-50
21:28 rivyn zer0def: yeah, that shouldn't happen
21:28 rivyn but nothing in the SLS should happen on a subsequent run with the same pillar arguments
21:28 zer0def you specified `clean: true`, so it sure should
21:29 rivyn MTecknology: how do I "fix this"?
21:29 MTecknology 21:13 <@whytewolf> https://docs.saltstack.com/en/latest/ref/states/requisites.html
21:29 shpoont joined #salt
21:29 zer0def MTecknology: what's wrong with `cluster_initialized`, precisely?
21:29 rivyn sigh
21:30 rivyn I have read that entire page many times.  I've tried everything I could guess to work.  Could you please offer a little bit of help?
21:30 MTecknology ah, crap- nevermind. I didn't look up the documentation for postgres_cluster.present
21:31 zer0def yeah, i was like "wot?"
21:31 whytewolf the only thing that should run every run is the cmd.run's
21:31 rivyn what documentation aside from https://docs.saltstack.com/en/latest/ref/states/all/salt.states.postgres_cluster.html?
21:31 rivyn and if it's there, what is it that's the concern?
21:32 MTecknology except that the file state for postgresql_basebackup would re-run every time because the next command after it creates data that it will then remove
21:33 rivyn MTecknology: that would only blow away the backup though, not the cluster itself
21:33 zer0def joined #salt
21:33 rivyn so yeah, I should fix that, but it's not the primary concern
21:33 zer0def rivyn: that's on rhel, right?
21:33 rivyn no, ubuntu
21:33 zer0def i don't remember which distro uses pg_ctlcluster
21:33 rivyn {%- if grains['osfinger'] == "Ubuntu-16.04" -%}
21:33 rivyn it's part of the postgresql-common package from PGDG repo
21:34 mrBen2k2k2k_ joined #salt
21:34 rivyn making the SLS's work across distributions is a task I'm not worrying about right now ;)
21:36 whytewolf rivyn: is start.conf in 'salt://postgresql/configuration/' ~ postgresql_version ?
21:36 rivyn zer0def: I'm not sure if PGDG's rpm packages include the same or not...IIRC the management of parallel-installed PG versions and multiple clusters came from Debian originally.
21:36 zer0def if that's the case, i'm pretty sure you'll appreciate the existence of `pg_createcluster`
21:36 rivyn yes, pg_createcluster is what postgres_cluster.present uses.
21:36 rollniak joined #salt
21:37 rivyn an annoying part about postgres_cluster module is that it doesn't work at all and throws errors if postgresql-common is not installed
21:37 zer0def do you intend to locate datadir somewhere else than the default /var/lib/postgresql/$version/$cluster ?
21:37 rivyn well, it was annoying before I started splitting up a bigger SLS into smaller onets
21:37 rivyn *ones
21:39 whytewolf states 'configuration' and 'startup_configuration' have the potential to loop if start.conf is in the source directory of configuration
21:41 rivyn whytewolf: it's not
21:41 rivyn whytewolf: it's created by pg_createcluster I believe
21:42 whytewolf ok. just making sure.
21:42 MTecknology I suspect seeing the output of a sanitized 'salt-call -l debug <...>' would shed some more light on what's up.
21:42 rivyn not sure how to do that
21:42 whytewolf also for your question on how to setup onchanges
21:42 whytewolf https://gist.github.com/whytewolf/d6c8ba091b103043f62b45621eed9bdf
21:43 rivyn eh?
21:43 rivyn I had onchanges: cluster_initialized
21:43 rivyn how's that different?
21:43 whytewolf you didn't have the module
21:43 whytewolf the only requisite that currently works with just the ID is require
21:43 rivyn because I have the ID of the step?
21:43 rivyn ahh crap
21:44 rivyn weird syntax though
21:44 rivyn why does it need the module BEFORE the name of the step which occurs first?
21:46 whytewolf That I couldn't tell you. that choice was made long before my time in salt.
21:46 rivyn ok, I'll burn my eyes out reading the requisite docs more later...I had just assumed the use of ID worked for all
21:47 MTecknology mod.tag always felt natural to me..
21:48 rivyn tag?
21:48 rivyn isn't that mod.id?
21:48 whytewolf id OR name
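
The shape whytewolf's gist is pointing at (this is not the gist itself): the requisite names the state module in front of the ID, so onchanges on the cluster state would look like this, assuming cluster_initialized is a postgres_cluster.present state and the command is illustrative:

    postgresql_basebackup:
      cmd.run:
        - name: pg_basebackup -D /srv/backup/basebackup
        - onchanges:
          - postgres_cluster: cluster_initialized
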
21:48 zer0def MTecknology, whytewolf, rivyn and anyone else, feel free to review and comment on changes: https://ghostbin.com/paste/nzxmg
21:49 zer0def pretty sure, though, that `file: startup_configuration` defeats the purpose of `file: configuration`
21:49 * whytewolf shudders at the use of names
21:49 rivyn zer0def: I've done similar locally, with the directory consolidation and such.  I removed file.directory form postgresql_basebackup though and just added it in with the rest of the directories
21:49 zer0def on top of just generally overdoing this as a whole
21:50 zer0def `file.directory` was fine, `clean` kwarg wasn't
21:50 rivyn there's some cases where I can consolidate things together that will help shorten my sls's for sure
21:50 rivyn zer0def: I know, but it made the code shorter
21:51 zer0def yeah, like starting off from `postgres_cluster.present` and move from there using templated configs
21:51 rivyn hmm?
21:51 rivyn zer0def: postgresql_basebackup is requiring itself in your paste?
21:51 zer0def i figure `configuration: file.recurse` is just complicating things for yourself
21:52 zer0def it's not requiring itself, the cmd requires the file
21:52 rivyn why?  I'm copying in 3 different config files with some jinja
21:52 rivyn zer0def: ahh, well that won't really fix it all the way.  I need an onlyif that tests if a specific file in the backup directory doesn't exist I think.
21:52 rivyn which I'm trying to come up with now
21:53 zer0def instead of `file.recurse` you'd probably be better off using `file.managed` in a similar way i've presented with `postgres_user.present` and `cluster_initialized: file.directory`, templating out in the sls
21:53 rivyn The inverse would be easy with onlyif: test -f ...
21:53 zer0def that file.managed dependent on `postgres_cluster`
21:53 rivyn not sure how to do it for if the file does NOT exist
21:53 whytewolf you could always reverse the test instead with unless
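
The reversed test whytewolf suggests, with an illustrative marker file: run the command only until the backup exists.

    postgresql_basebackup:
      cmd.run:
        - name: pg_basebackup -D /srv/backup/basebackup
        - unless: test -f /srv/backup/basebackup/backup_label
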
21:54 rivyn zer0def: I don't follow, sorry.
21:54 zer0def i'll write up a concept of how i'd approach the issue, but you're mostly tangling up in yourself
21:55 rivyn tangling up in myself??
21:58 rivyn zer0def: I got an error trying your syntax to create multiple users at once
21:59 zer0def it probably also gave you a hint at what's wrong
21:59 rivyn zer0def: https://ghostbin.com/paste/vtq2r
21:59 whytewolf missing :
21:59 rivyn my bad, forgot colon
21:59 rivyn I was looking too hard at the line indicated by <====== ;)
22:01 zer0def some broad strokes on the same issue: https://ghostbin.com/paste/jmews
22:01 zer0def didn't give a crap about renaming states
22:02 zer0def also, i'm pretty sure that `postgres_user` permissions aren't given, unless they're actually `true` (they are, however, taken away, when `false` is provided, in case someone elevated)
22:02 rivyn one of your comments mentions using a service state.  Unfortunately, service start/stop starts and stops all clusters on the server at once.
22:03 rivyn I used that in my initial monolithic salt states
22:03 rivyn zer0def: yeah, I just confirmed that a minute ago
22:03 zer0def pretty sure i've seen some behavior where you could provide additional arguments on shell `service restart postgresql 10 main` or something similar, but would need to sit down and test this
22:04 rivyn it looked like an SLS I came up with which was very similar to your first example worked well, but on subsequent runs, the minion is hanging
22:04 rivyn If I kill the PID and start again it'll w ork
22:04 zer0def i see `service.running` does take kwargs, so might be that the systemctl call gets passed those
22:04 rivyn then it won't again
22:05 rivyn I end up seeing a new "starting a new job..." in the minion log every couple seconds
22:05 rivyn I can paste the whole log, gimme a sec
22:06 zer0def no need, i'm already worn out (it's midnight here)
22:06 zer0def unless someone else wants to hold your hand over your next set of hurdles
22:06 rivyn darn :(
22:06 rivyn well thanks for the help so far
22:07 rivyn why does help need to be accompanied by insult?
22:07 rivyn I'm trying my best
22:07 zer0def but most of your issues so far have been you shooting yourself in the foot and overcomplicating the situation for yourself
22:07 rivyn sorry I'm not as smart as you.  That's why I'm here asking.
22:07 rivyn trying to learn.
22:07 zer0def spin up an lxc/vm to test your setup on each iteration, it really helps
22:08 rivyn I do
22:14 rivyn here's the log when it gets into a never-returning state if anybody can help:
22:14 rivyn https://ghostbin.com/paste/vtq2r
22:15 rivyn If I remove the onchanges, it doesn't get stuck.  Same as when I used the ID there instead.
22:15 whytewolf what happens when you run that command when it has already been run?
22:16 rivyn whytewolf: it works fine
22:16 rivyn everything is Result: Clean
22:17 whytewolf odd.
22:17 rivyn you know what?
22:17 rivyn it's not running my new code
22:18 rivyn it's saying Succeeded: 13 which  was the number of commands in the original version
22:18 rivyn and I put a purposeful typo in the SLS and ran it again with no difference
22:18 whytewolf gitfs?
22:18 rivyn do you know why this would be?
22:18 rivyn I don't think so.
22:19 rivyn On our saltmaster, I just created a directory under /srv/salt and have been working there
22:20 rivyn each one of the directories under /srv/salt is a separate git repo
22:20 rivyn usually I just edit and state.apply and the changes are immediately effective
22:21 rivyn I just committed all my changes and tried again, no difference
22:22 whytewolf try salt-run fileserver.clear_cache followed by salt-run fileserver.update
22:25 rivyn returned "No cache was cleared" and "True", then I see the same behavior.  Which I pasted here:  https://ghostbin.com/paste/tepug
22:26 whytewolf i count 13 states in your file still
22:26 rivyn I guess I'm really confused about what a state is then
22:26 rivyn but ok
22:27 whytewolf well basically - names creates extra states.
22:27 rivyn so each name is a separate state
22:27 rivyn ok, well then nevermind that distraction.
22:28 whytewolf okay. it was one of the reasons i shuddered about the use of - names earlier.
22:28 whytewolf overuse can really be confusing and cause slowdowns
22:29 rivyn sorry I missed that, I thought I was being called an idiot by not using it to consolidate code more... ;)
22:29 rivyn anyways it doesn't matter.  I have the same problem as I did when I first came here
22:29 rivyn the SLS fails to apply until I kill the PID and execute the same state.apply again, with the log pasted at https://ghostbin.com/paste/vtq2r
22:30 rivyn it gets to the cmd.run and then ends up stuck I guess
22:30 rivyn something to do with the onchanges, because if I remove that, it works every time
22:31 zer0def joined #salt
22:31 whytewolf whats odd is that the item with onchanges shouldn't be running on a second attempt.
22:31 whytewolf only the first attempt.
22:31 rivyn err, no it doesn't, it still gets stuck
22:32 rivyn earlier I must have done something a bit different
22:32 rivyn that's why the onchanges is there, to keep it from executing on a second attempt
22:32 rivyn the log I pasted was of a FIRST attempt though.
22:32 zer0def i still think the biggest potential for fault is those two configuration `file` states
22:33 whytewolf those shouldn't cause the onchanges to trigger
22:33 whytewolf nor should they cause the command to not return [which is most likely causing the hang]
22:33 zer0def well, the cmd.run should start the cluster *after* configuration files are properly set (which isn't clear from the provided example)
22:34 whytewolf it is happening after the file changes.
22:35 whytewolf and it is set to onchanges on the postgres_cluster command not the file changes
22:35 zer0def oh yeah, based on the order specified in file
22:36 rivyn if file order isn't sufficient I'm open to ideas to improve that.
22:36 zer0def in which case, i'd opt for a configuration `file` state(s) `require`ing `postgres_cluster`, then have two `cmd.run`s, one for calling `pg_ctlcluster start` with an `unless` on the pidfile and `require`ing config files, another for reloading having `onchanges` on the first cmd and configuration `file`s
22:37 kwork joined #salt
22:37 zer0def given how you're modifying `start.conf`, there's probably a `module.run` on `systemd.systemctl_reload` somewhere to be squeezed between
22:37 whytewolf could do a creates on the pidfile.
22:37 zer0def yeah, that's right
22:38 rivyn pid file existing isn't the most accurate thing to go by is it?  It can exist even if the PID isn't running
22:38 rivyn shouldn't but can
22:38 zer0def in either case, spelling out requisites explicitly allows you for much easier drawing out of a directed graph of states
22:39 zer0def you can always run the shell equivalent if the file exists and check whether the pid contained inside exists and is a postgres process
22:39 shpoont joined #salt
22:39 zer0def depends on how thorough you want it to be
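
One possible reading of zer0def's suggested structure, with an illustrative version/cluster name and pidfile path, and 'configuration' standing for the config-file state(s):

    start_cluster:
      cmd.run:
        - name: pg_ctlcluster 10 test start
        - unless: test -s /var/run/postgresql/10-test.pid
        - require:
          - file: configuration

    reload_cluster:
      cmd.run:
        - name: pg_ctlcluster 10 test reload
        - onchanges:
          - cmd: start_cluster
          - file: configuration
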
22:39 rivyn I'm sure I have lots of room for improvement and am open to hearing about them.  I'll take feedback into account and apply it to my work.  But I don't understand why what I have now isn't working.
22:40 whytewolf could get meta. check if the /proc pid directory exists for the pid in the pid file
22:40 zer0def pretty sure we're getting sidetracked here
22:40 whytewolf yeah. right now, figuring out why onchanges causes a hang.
22:41 rivyn pg_ctlcluster 10 test start always works fine if I run it by hand, whether or not the cluster is already running.  So why is it getting hung in salt?
22:41 rivyn really it shouldn't hurt to just always run that
22:41 rivyn I took the onchanges out and still get the hang
22:41 whytewolf oh so the onchanges isn't causing the hang
22:42 zer0def curious - could you temporarily not change the `start.conf`?
22:42 rivyn also, even though it hangs salt, it DOES start the cluster
22:43 rivyn I'll comment out the startup_configuration step
22:43 whytewolf ... try adding `bg: True` to the cmd.run
22:44 rivyn same problem with start.conf changed back to auto
22:46 rivyn whytewolf: that appears to work
22:46 * rivyn tests a bit more things
22:46 whytewolf make sure you don't have a random process you don't expect
22:46 rivyn what do you mean?
22:46 rivyn it's working fine with bg: true in there, I put the onchanges back as well.
22:47 whytewolf bg: True tells the command "I don't care about the output. put it into the background"
22:47 zer0def a quick review of `sub start` in `pg_ctlcluster` suggests it does some waiting for some sockets to exist, might be something there?
22:47 rivyn speaking of onchanges, if onchanges: <id> isn't valid syntax then why didn't I get an error?
22:47 whytewolf because it doesn't error. it is ... kind of in a limbo right now.
22:47 rivyn whytewolf: I thought backgrounding would be risky since it might move on to the subsequent steps before the cluster had finished starting up
22:49 whytewolf that is a valid risk.
22:49 rivyn haha, I have a serious bug that's preventing that risk I just discovered
22:49 zer0def `database_users:postgres_user` should be a synchronization point, anyway
22:49 rivyn the postgres_user stuff is hitting the default cluster, not the one I intend
22:51 rivyn fixed that, and ran into the problem I expected with bg: true
22:51 rivyn the users fail to create because the db isn't done starting up by the time salt tries to create them
22:52 whytewolf is it possible the error was what was causing the hang to begin with?
22:52 rivyn no, because it just caused those states to return clean instead of doing anything
22:52 rivyn but I'll double-check to be sure.
22:53 rivyn still hanging with bg: true removed
22:53 zer0def well, for starters, after reviewing the function `pg_ctlcluster start` definitely returns when it's absolutely certain the cluster has started, so backgrounding that might be a bad idea
22:53 rivyn `pg_lsclusters` on the host shows the cluster as up and running
22:53 rivyn and I can connect to it fine
22:55 rivyn but the users aren't present in the database
22:56 zer0def well, yeah, they shouldn't be created until cmd.run returns
22:56 whytewolf right since the states don't get to that part
22:56 rivyn well, thank you both for your help. I need to head home shortly, but I'll be back in the morning and will review the IRC log first thing before resuming work on this
22:56 rivyn whytewolf: yeah, I was just confirming
22:56 zer0def you could `pstree` on the minion while `pg_ctlcluster start` is running to have an idea how far in `sub start` it actually is
22:57 rivyn haven't got pstree
22:57 whytewolf yeah. i think the next step is to find out why pg_ctlcluster isn't returning when it is being called from python
22:57 zer0def then htop switched to tree view, just about anything to see whether pg_ctlcluster has any forked processes that may hold it
22:57 rivyn ah, pstree is part of psmisc package, got it now
22:58 rivyn https://ghostbin.com/paste/wyqvn
22:58 rivyn here's the pstree currently with the hung salt minion
22:58 zer0def hah, it's hung up on systemctl, that's funny
22:58 rivyn yeah, how?
22:59 whytewolf ... I think I know what is going on. I need to do some digging one moment.
22:59 zer0def heck if i know, so you could try slapping in `systemctl daemon-reload` as per `start.conf`'s instruction, in the same line `pg_ctlcluster start` runs, but before it
23:00 rivyn systemctl is what the `service` command uses?
23:00 zer0def on ubuntu since 16.04, yeah
23:02 zer0def or do `module.run` with the `systemd.systemctl_reload`; but a quick peek into `/lib/systemd/system/postgresql@.service` shows some interesting behavior, too
23:03 zer0def try adding `--skip-systemctl-redirect` to that `pg_ctlcluster` call
23:04 rivyn So, with the way I have the SLS files broken up now, you can use them to install multiple postgresql versions and install multiple clusters on each.  I don't want installing a new cluster to restart an existing one.  I also use pacemaker/corosync to manage the service, which doesn't use the service command either iirc
23:04 zer0def (since you're basically doing the service file's job for it, it'd probably be a good idea to mimick what it's doing)
23:05 rivyn zer0def: where are you getting that?  It's not in the man page for pg_ctlcluster
23:06 zer0def pulled it from `/lib/systemd/system/postgresql\@.service`, the systemd service file for postgresql
23:06 zer0def given your pstree output and the argument's name, in my sleepy mind it's worth a shot
23:07 rivyn error:  Unknown option: skip-systemctl-restart
23:08 rivyn oops, redirect
23:08 rivyn one
23:08 rivyn one sec
23:09 zer0def you sure?: https://ghostbin.com/paste/dvdhe
23:09 rivyn well, it works the same as with bg: true now
23:09 rivyn the users fail to create as it tries that too early it seems
23:10 rivyn aanriot: hec
23:10 zer0def well, at least it's *some* progress
23:10 zer0def the cmd.run doesn't actually hang now.
23:10 rivyn the errors in the minion log indicate that pg isn't running when it tries to create the users
23:10 zer0def got an output of that last run?
23:11 rivyn will get it
23:12 zer0def i mean, if it still behaves this way, you might as well put a port checking loop somewhere between `pg_ctlcluster start` and user creation
23:13 rivyn actually I fixed that....
23:13 zer0def the cause was…?
23:13 rivyn needed to set db_host: {{ socket_directory }} for the postgres_user module
23:13 zer0def oh yeah, because it's not the default, main cluster
23:13 rivyn port alone wasn't enough
23:14 rivyn looks to work awesome now.  I really need to run, but if you're around tomorrow I'd love to chat more about systemctl and such to understand this
23:14 zer0def and by "port" i did mean socket, just forgot about the unix socket pg spawns
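
A sketch of the fix rivyn describes: point postgres_user at the new cluster's socket directory and port instead of the default cluster (the values shown are placeholders for the jinja variables in the real SLS):

    app_db_user:
      postgres_user.present:
        - name: app
        - db_host: /var/run/postgresql   # {{ socket_directory }}
        - db_port: 5433
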
23:15 rivyn I'd also like to go over your last paste and consider further improvements I can make
23:15 zer0def i personally find it irritating, because in a regular non-systemd case, this probably wouldn't have happened in the first place
23:16 onlyanegg joined #salt
23:16 rivyn thanks guys, be back tomorrow
23:16 zer0def well, the main difference from what you have now is more explicit requisite stating and using templated files through `file.managed`
23:16 zer0def that's relatively straight-forward via trial and error
23:17 zer0def don't forget to say your profanities towards the systemd devs for wasting so much of your time, before you go rest
23:19 zer0def (i'm kidding, btw, but if it annoyed you, you might as well murmur something under your nose)
23:21 rlefort joined #salt
23:22 onlyaneg1 joined #salt
23:45 rlefort joined #salt
