Time |
Nick |
Message |
00:01 |
|
owlbot joined #gluster |
00:05 |
|
EinstCra_ joined #gluster |
00:05 |
|
owlbot joined #gluster |
00:07 |
|
haomaiwa_ joined #gluster |
00:09 |
|
owlbot joined #gluster |
00:13 |
|
owlbot joined #gluster |
00:17 |
|
owlbot joined #gluster |
00:22 |
|
owlbot joined #gluster |
00:26 |
|
owlbot joined #gluster |
00:30 |
|
owlbot joined #gluster |
00:31 |
|
longwuyu1n joined #gluster |
00:31 |
longwuyu1n |
hi |
00:31 |
glusterbot |
longwuyu1n: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer. |
00:34 |
|
owlbot joined #gluster |
00:38 |
|
owlbot joined #gluster |
00:42 |
|
owlbot joined #gluster |
00:46 |
|
owlbot joined #gluster |
00:48 |
|
Alghost_ joined #gluster |
00:50 |
|
owlbot joined #gluster |
00:54 |
|
owlbot joined #gluster |
00:58 |
|
owlbot joined #gluster |
01:02 |
|
owlbot joined #gluster |
01:07 |
|
plarsen joined #gluster |
01:18 |
|
harish joined #gluster |
01:23 |
|
haomaiwa_ joined #gluster |
01:35 |
|
plarsen joined #gluster |
01:40 |
|
nbalacha joined #gluster |
01:47 |
|
nishanth joined #gluster |
01:48 |
|
EinstCrazy joined #gluster |
01:49 |
|
baojg joined #gluster |
01:57 |
|
chromatin joined #gluster |
02:04 |
|
nangthang joined #gluster |
02:05 |
|
caitnop joined #gluster |
02:14 |
|
pppp joined #gluster |
02:14 |
|
haomaiwa_ joined #gluster |
02:21 |
|
rafi joined #gluster |
02:25 |
|
mtanner joined #gluster |
02:29 |
|
ovaistar_ joined #gluster |
02:31 |
|
muneerse2 joined #gluster |
02:36 |
|
Wizek_ joined #gluster |
02:48 |
|
ilbot3 joined #gluster |
02:48 |
|
Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/ |
02:49 |
|
BuffaloCN left #gluster |
02:52 |
|
ashiq joined #gluster |
02:54 |
|
haomaiwa_ joined #gluster |
02:55 |
|
Lee1092 joined #gluster |
03:01 |
|
haomaiwa_ joined #gluster |
03:02 |
|
arcolife joined #gluster |
03:06 |
|
nangthang joined #gluster |
03:24 |
|
sakshi joined #gluster |
03:29 |
|
ovaistariq joined #gluster |
03:31 |
|
baojg joined #gluster |
03:36 |
|
overclk joined #gluster |
03:39 |
|
nbalacha joined #gluster |
03:53 |
|
ramteid joined #gluster |
03:53 |
|
atinm joined #gluster |
03:58 |
|
shubhendu joined #gluster |
04:01 |
|
haomaiwa_ joined #gluster |
04:05 |
|
kanagaraj joined #gluster |
04:07 |
|
itisravi joined #gluster |
04:08 |
|
itisravi joined #gluster |
04:08 |
|
ppai joined #gluster |
04:23 |
|
nehar joined #gluster |
04:29 |
|
karthikfff joined #gluster |
04:33 |
|
Alghost_ joined #gluster |
04:34 |
|
Manikandan joined #gluster |
04:52 |
|
hgowtham joined #gluster |
04:53 |
|
RameshN joined #gluster |
04:54 |
|
ndarshan joined #gluster |
04:56 |
|
jiffin joined #gluster |
05:00 |
|
PotatoGim joined #gluster |
05:00 |
|
ovaistariq joined #gluster |
05:01 |
|
haomaiwa_ joined #gluster |
05:09 |
|
nehar joined #gluster |
05:10 |
|
Bhaskarakiran joined #gluster |
05:16 |
|
gem joined #gluster |
05:20 |
|
pppp joined #gluster |
05:23 |
|
EinstCra_ joined #gluster |
05:25 |
|
ramky joined #gluster |
05:30 |
|
poornimag joined #gluster |
05:32 |
|
Saravanakmr joined #gluster |
05:38 |
|
python_lover joined #gluster |
05:55 |
|
harish_ joined #gluster |
05:57 |
|
nishanth joined #gluster |
05:59 |
|
atalur joined #gluster |
06:01 |
|
karnan joined #gluster |
06:01 |
|
haomaiwa_ joined #gluster |
06:05 |
|
nangthang joined #gluster |
06:06 |
|
skoduri joined #gluster |
06:06 |
|
gowtham joined #gluster |
06:06 |
|
kdhananjay joined #gluster |
06:07 |
|
merp_ joined #gluster |
06:13 |
|
Alghost_ joined #gluster |
06:15 |
|
ppai joined #gluster |
06:17 |
|
harish_ joined #gluster |
06:17 |
|
anil joined #gluster |
06:18 |
|
atinm joined #gluster |
06:21 |
|
Bhaskarakiran joined #gluster |
06:21 |
|
Bhaskarakiran joined #gluster |
06:29 |
|
kdhananjay joined #gluster |
06:29 |
|
python_lover joined #gluster |
06:31 |
|
DV joined #gluster |
06:36 |
|
aravindavk joined #gluster |
06:38 |
|
EinstCrazy joined #gluster |
06:48 |
|
RameshN joined #gluster |
06:58 |
|
atinm joined #gluster |
07:01 |
|
ovaistariq joined #gluster |
07:01 |
|
haomaiwa_ joined #gluster |
07:04 |
|
baojg joined #gluster |
07:07 |
|
robb_nl joined #gluster |
07:09 |
|
mhulsman joined #gluster |
07:09 |
|
mhulsman1 joined #gluster |
07:16 |
|
[Enrico] joined #gluster |
07:18 |
|
nangthang joined #gluster |
07:20 |
|
mobaer joined #gluster |
07:22 |
|
jtux joined #gluster |
07:24 |
|
DV joined #gluster |
07:28 |
|
mbukatov joined #gluster |
07:34 |
|
harish_ joined #gluster |
07:38 |
|
overclk joined #gluster |
07:42 |
|
Humble joined #gluster |
07:48 |
|
kdhananjay joined #gluster |
07:56 |
|
aravindavk joined #gluster |
07:58 |
|
DV joined #gluster |
08:00 |
|
wolsen joined #gluster |
08:01 |
|
[diablo] joined #gluster |
08:01 |
|
haomaiwa_ joined #gluster |
08:01 |
|
deniszh joined #gluster |
08:05 |
|
RameshN joined #gluster |
08:11 |
|
jri joined #gluster |
08:16 |
|
owlbot joined #gluster |
08:20 |
|
owlbot joined #gluster |
08:24 |
|
owlbot joined #gluster |
08:27 |
|
Slashman joined #gluster |
08:28 |
|
merp_ joined #gluster |
08:32 |
|
itisravi joined #gluster |
08:39 |
|
hackman joined #gluster |
08:41 |
|
ira joined #gluster |
08:43 |
|
ctria joined #gluster |
08:51 |
|
DV joined #gluster |
08:55 |
|
fsimonce joined #gluster |
08:57 |
|
nehar joined #gluster |
08:59 |
|
Bhaskarakiran joined #gluster |
08:59 |
|
ivan_rossi joined #gluster |
09:01 |
|
haomaiwang joined #gluster |
09:02 |
|
ovaistariq joined #gluster |
09:05 |
|
nbalacha joined #gluster |
09:06 |
|
python_lover joined #gluster |
09:19 |
|
Ulrar left #gluster |
09:23 |
|
harish_ joined #gluster |
09:36 |
|
owlbot joined #gluster |
09:40 |
|
owlbot joined #gluster |
09:44 |
|
owlbot joined #gluster |
09:56 |
|
DV joined #gluster |
10:01 |
|
owlbot joined #gluster |
10:01 |
|
haomaiwa_ joined #gluster |
10:03 |
|
ira joined #gluster |
10:06 |
|
poornimag joined #gluster |
10:06 |
|
nbalacha joined #gluster |
10:27 |
|
arcolife joined #gluster |
10:28 |
|
Bhaskarakiran joined #gluster |
10:47 |
|
mhulsman joined #gluster |
10:54 |
|
itisravi joined #gluster |
10:55 |
|
DV joined #gluster |
10:56 |
|
mhulsman1 joined #gluster |
11:00 |
|
aravindavk joined #gluster |
11:02 |
|
Wizek_ joined #gluster |
11:02 |
|
ovaistariq joined #gluster |
11:07 |
|
abyss__ joined #gluster |
11:11 |
|
ppai joined #gluster |
11:13 |
|
poornimag joined #gluster |
11:25 |
|
Wizek joined #gluster |
11:33 |
|
haomaiwa_ joined #gluster |
11:41 |
|
jockek joined #gluster |
11:43 |
|
sghatty_ joined #gluster |
11:43 |
|
fyxim joined #gluster |
11:45 |
|
sankarshan_away joined #gluster |
11:48 |
|
haomaiwa_ joined #gluster |
11:49 |
|
virusuy joined #gluster |
11:52 |
|
Nakiri__ joined #gluster |
11:53 |
|
dron23 joined #gluster |
12:00 |
|
owlbot joined #gluster |
12:01 |
|
haomaiwang joined #gluster |
12:02 |
|
owlbot joined #gluster |
12:03 |
|
hagarth joined #gluster |
12:10 |
|
yoavz joined #gluster |
12:16 |
|
kdhananjay joined #gluster |
12:18 |
|
ppai joined #gluster |
12:26 |
|
nehar joined #gluster |
12:42 |
|
nbalacha joined #gluster |
12:56 |
|
mobaer joined #gluster |
13:01 |
|
haomaiwa_ joined #gluster |
13:12 |
|
poornimag joined #gluster |
13:17 |
|
ira joined #gluster |
13:23 |
|
Nakiri__ joined #gluster |
13:25 |
|
theron joined #gluster |
13:29 |
|
chirino_m joined #gluster |
13:34 |
|
ParsectiX joined #gluster |
13:35 |
|
aravindavk joined #gluster |
13:36 |
|
sathees joined #gluster |
13:40 |
|
chromatin joined #gluster |
13:45 |
|
voobscout joined #gluster |
13:49 |
|
johnmilton joined #gluster |
13:53 |
|
unclemarc joined #gluster |
14:01 |
|
voobscout joined #gluster |
14:01 |
|
haomaiwang joined #gluster |
14:08 |
|
DV joined #gluster |
14:10 |
|
natarej_ joined #gluster |
14:13 |
|
hamiller joined #gluster |
14:17 |
|
kdhananjay joined #gluster |
14:24 |
|
skoduri joined #gluster |
14:25 |
|
julim joined #gluster |
14:26 |
|
theron joined #gluster |
14:26 |
|
theron joined #gluster |
14:27 |
|
natarej joined #gluster |
14:29 |
|
Ulrar joined #gluster |
14:31 |
Ulrar |
Hi, I have trouble understanding something, if someone can help. I have 3 nodes configured for one volume, with a replica count set to 3 (Number of Bricks: 1 x 3 = 3). There is about 600 GB worth of space on each, and with a replica of 3 I'd expect to see a volume of 600 GB, but I see twice that in df -h |
14:32 |
Ulrar |
I was thinking of lowering the replica to 2, but it seems weird |
14:35 |
|
wnlx joined #gluster |
14:39 |
|
Wizek joined #gluster |
14:40 |
Ulrar |
Oh wait, it's my fault. Forgot I was using RAID 5, glusterfs is displaying the correct number here, my bad |
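(Aside: those numbers add up once the RAID is accounted for. A pure 1 x 3 replica reports the size of a single brick to clients, not the sum of all three, so df showing roughly double just means each RAID-5 backed brick is bigger than expected. A quick sanity check, with /data/brick and /mnt/vol as hypothetical mount points:
    df -h /data/brick    # on each server: the brick filesystem size after RAID 5
    df -h /mnt/vol       # on a client: should match one brick, not three)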
14:44 |
|
voobscout joined #gluster |
14:44 |
|
skylar joined #gluster |
14:53 |
|
chirino joined #gluster |
14:56 |
|
plarsen joined #gluster |
14:58 |
|
ekuric1 joined #gluster |
14:59 |
|
DV joined #gluster |
15:01 |
|
haomaiwa_ joined #gluster |
15:04 |
|
ovaistariq joined #gluster |
15:05 |
|
tswartz joined #gluster |
15:11 |
|
amye joined #gluster |
15:19 |
|
rwheeler joined #gluster |
15:22 |
|
hchiramm joined #gluster |
15:22 |
|
Slashman joined #gluster |
15:22 |
|
DV__ joined #gluster |
15:28 |
|
DV joined #gluster |
15:31 |
|
atalur joined #gluster |
15:34 |
|
NuxRo joined #gluster |
15:35 |
|
wushudoin joined #gluster |
15:37 |
|
rGil joined #gluster |
15:49 |
|
theron joined #gluster |
15:53 |
|
farhorizon joined #gluster |
15:53 |
|
robb_nl joined #gluster |
15:59 |
|
Slashman joined #gluster |
15:59 |
|
Slashman joined #gluster |
16:01 |
|
7GHAABPAU joined #gluster |
16:02 |
|
theron joined #gluster |
16:03 |
|
dpaz joined #gluster |
16:05 |
dpaz |
hi guys, I have a 3-node gluster setup and I was wondering if I need to configure any fencing agent and if there's any guide for that |
16:09 |
dpaz |
actually I'm sure I need it, but is there anything integrated with gluster |
16:19 |
|
ovaistariq joined #gluster |
16:21 |
kkeithley |
dpaz: There are pacemaker resource agents for gluster in the glusterfs-resource-agents RPMs for Fedora/RHEL/CentOS. They're there too in the SuSE RPMs and Debian/Ubuntu .debs. |
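(Aside: a minimal sketch of picking those up on a Fedora/RHEL/CentOS node; the package name is the one kkeithley mentions, and the second command simply lists whatever OCF agents get installed, assuming pacemaker/pcs is already present:
    yum install glusterfs-resource-agents
    pcs resource list | grep -i gluster    # shows the gluster OCF resource agents)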
16:27 |
dpaz |
kkeithley: thanks! |
16:29 |
|
theron joined #gluster |
16:31 |
|
merp_ joined #gluster |
16:33 |
|
jiffin joined #gluster |
16:34 |
|
voobscout joined #gluster |
16:35 |
|
kanagaraj joined #gluster |
16:35 |
|
dnoland1 joined #gluster |
16:35 |
|
rafi joined #gluster |
16:38 |
dnoland1 |
our gluster nfs share is allowing us to run commands like $(touch some_new_file), but if we run $(echo words > some_new_file) we get this: some_new_file: Read-only file system |
16:39 |
dnoland1 |
We also *can* do this: $(touch new_file && echo words > new_file) |
16:40 |
dnoland1 |
It is just writing data on a new file that is causing problems. Any thoughts? |
16:40 |
JoeJulian |
My first assumption is that you have quorum enabled and have lost quorum. |
16:41 |
dnoland1 |
I do have log errors to that effect. That said, I have the following quorum settings: |
16:41 |
dnoland1 |
cluster.quorum-type (none) |
16:41 |
|
shubhendu joined #gluster |
16:41 |
dnoland1 |
cluster.quorum-count (null) |
16:42 |
|
wolsen joined #gluster |
16:42 |
dnoland1 |
that is just for our nfs share |
16:42 |
dnoland1 |
The remainder of our gluster system is working fine |
16:42 |
|
neofob joined #gluster |
16:43 |
|
bfm joined #gluster |
16:43 |
dnoland1 |
I am seeing this in the nfs log |
16:43 |
dnoland1 |
[2016-02-22 16:37:55.253163] W [MSGID: 114031] [client-rpc-fops.c:2402:client3_3_create_cbk] 0-home-client-13: remote operation failed. Path: /d/dano2364/some_random_file [Transport endpoint is not connected] |
16:43 |
dnoland1 |
[2016-02-22 16:37:55.299597] W [MSGID: 108001] [afr-transaction.c:686:afr_handle_quorum] 0-home-replicate-4: /d/dano2364/some_random_file: Failing CREATE as quorum is not met |
16:43 |
|
voobscout joined #gluster |
16:44 |
dnoland1 |
But all three of our gluster servers can see each other (based on this command): sudo gluster peer status |
16:44 |
dnoland1 |
Number of Peers: 2 |
16:44 |
dnoland1 |
Hostname: ss-62 |
16:44 |
dnoland1 |
Uuid: 752c9501-dea7-467c-91ec-2e942df2d86c |
16:44 |
dnoland1 |
State: Peer in Cluster (Connected) |
16:44 |
dnoland1 |
Hostname: ss-61 |
16:44 |
dnoland1 |
Uuid: bbab75b5-77a0-4752-b410-054844184137 |
16:44 |
dnoland1 |
State: Peer in Cluster (Connected) |
16:44 |
JoeJulian |
@paste |
16:44 |
glusterbot |
JoeJulian: For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999 |
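(Aside: a concrete form of glusterbot's suggestion, using the 'home' volume JoeJulian asks about just below; the command prints a termbin.com URL that can be shared in the channel:
    gluster volume status home | nc termbin.com 9999)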
16:44 |
dnoland1 |
sorry, will use pbin |
16:45 |
dnoland1 |
http://sprunge.us/BKhg |
16:46 |
JoeJulian |
I guess we should look at 'gluster volume status home' |
16:46 |
bfm |
Hi Guys! I have geo-replication working between two clusters, but recently I noticed hidden files appearing on the geo-rep slaves. Say, I have dir/file1.txt at the source and at some stage on the geo-rep slave I see dir/.file1.txt.VA5U1q of zero size. It looks to me that some sort of geo-rep hiccup is happening here, but I can't track it down through the logs :-( |
16:47 |
JoeJulian |
That looks like an rsync tempfile. |
16:47 |
dnoland1 |
http://sprunge.us/NFBj |
16:49 |
bfm |
@JoeJulian, trouble is that those files do not get cleaned up |
16:55 |
JoeJulian |
bfm: Are you sure the geo-rep of those files is complete? My assumption would be that they would stick around so they could be continued if it's not. |
16:56 |
JoeJulian |
If they are, I would file a bug report about that. |
16:56 |
glusterbot |
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS |
16:57 |
JoeJulian |
dnoland1: You do have some services that are not running. That shouldn't cause a loss of quorum when none is defined though. |
16:59 |
|
theron joined #gluster |
17:00 |
JoeJulian |
dnoland1: Try setting quorum.count to 0 instead of null. |
17:00 |
dnoland1 |
JoeJulian: Thank you. I am not sure why Self-heal Daemon is not running on ss-61 |
17:00 |
dnoland1 |
And I will try that |
17:01 |
|
haomaiwang joined #gluster |
17:02 |
bfm |
JoeJulian: it looks like this for example on geo-rep slave: |
17:02 |
bfm |
-rw-r--r-- 1 2001 2001 98174969 Feb 22 16:30 file.20160218.0.379.noarch.rpm |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 14:48 .file.20160218.0.379.noarch.rpm.3VVTvo |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 15:18 .file.20160218.0.379.noarch.rpm.6g6KYl |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 15:36 .file.20160218.0.379.noarch.rpm.BQlZEt |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 14:07 .file.20160218.0.379.noarch.rpm.DtVws4 |
17:02 |
glusterbot |
bfm: -rw-r--r's karma is now -19 |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 14:36 .file.20160218.0.379.noarch.rpm.HKGIVU |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 16:48 .file.20160218.0.379.noarch.rpm.KWHKOB |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 16:18 .file.20160218.0.379.noarch.rpm.LCvi6L |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -6 |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 16:06 .file.20160218.0.379.noarch.rpm.TFZt8Z |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 14:18 .file.20160218.0.379.noarch.rpm.TZ3JJZ |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -7 |
17:02 |
bfm |
-rw------- 0 root repluser 0 Feb 22 15:06 .file.20160218.0.379.noarch.rpm.UOQTZt |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -8 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -9 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -10 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -11 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -12 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -13 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -14 |
17:02 |
glusterbot |
bfm: -rw-----'s karma is now -15 |
17:02 |
JoeJulian |
@paste |
17:02 |
glusterbot |
JoeJulian: For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999 |
17:03 |
bfm |
sorry! |
17:04 |
|
ivan_rossi left #gluster |
17:05 |
dnoland1 |
JoeJulian: I tried to set quorum.count to zero, http://sprunge.us/fCOc |
17:06 |
JoeJulian |
So I assume you pasted that, bfm, because you would like me to confirm my previous advice. I still recommend you file a bug report. |
17:06 |
glusterbot |
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS |
17:09 |
JoeJulian |
dnoland1: Aha... "/* If user doesn't configure anything enable auto-quorum if the replica has odd number of subvolumes */" |
17:10 |
JoeJulian |
dnoland1: so you would have to set quorum-type to "none" to disable that. |
17:10 |
|
rcampbel3 joined #gluster |
17:11 |
JoeJulian |
Which still doesn't explain why you're hitting quorum issues with all your bricks online. |
17:11 |
dnoland1 |
JoeJulian: ok, so I should run $(sudo gluster volume set home cluster.quorum-type none) on my gluster daemons |
17:11 |
dnoland1 |
? |
17:11 |
JoeJulian |
... unless there are firewall issues. |
17:11 |
dnoland1 |
Ok, will try that. I appreciate your help :) |
17:11 |
JoeJulian |
You would only have to set it on one. |
17:11 |
dnoland1 |
k |
17:12 |
JoeJulian |
The changes made through the cli are cluster-wide. |
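(Aside: a minimal sketch of the change being discussed, run from any single peer since volume options are cluster-wide; 'home' is the volume from this conversation:
    gluster volume set home cluster.quorum-type none
    gluster volume info home    # the change shows up under "Options Reconfigured")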
17:13 |
|
bluenemo joined #gluster |
17:19 |
dnoland1 |
Ok, change made. |
17:19 |
dnoland1 |
Is it possible that I am hitting quorum issues just because the self-heal daemon on ss-61 is offline (for reasons I don't understand)? |
17:20 |
|
calavera joined #gluster |
17:20 |
JoeJulian |
No |
17:21 |
JoeJulian |
It's because the nfs service isn't connected to all the bricks in the replica subvolume. |
17:21 |
JoeJulian |
0-home-replicate-4 |
17:26 |
|
bennyturns joined #gluster |
17:29 |
|
jri joined #gluster |
17:30 |
bfm |
JoeJulian: http://termbin.com/8s6n |
17:30 |
bfm |
that's on the geo-rep slave |
17:31 |
|
plarsen joined #gluster |
17:31 |
dnoland1 |
JoeJulian: I see. The nfs server on ss-61 is off for some reason. Is there a more proper way to restart it than using systemd to restart glusterd? |
17:32 |
|
edong23 joined #gluster |
17:35 |
|
ahino joined #gluster |
17:38 |
|
theron joined #gluster |
17:41 |
JoeJulian |
dnoland1: Another way is "gluster volume start $volname force" |
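(Aside: the concrete form for this case with the 'home' volume; 'force' only starts whichever brick/NFS/self-heal processes are currently down and leaves the running ones alone:
    gluster volume start home force)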
17:45 |
|
merp_ joined #gluster |
17:48 |
dnoland1 |
That resolved the issue |
17:48 |
dnoland1 |
Thank you JoeJulian. I really appreciate your help |
17:49 |
dnoland1 |
(just inherited this position from my boss, and knew nothing about gluster until just a little while ago, so your help really saved me) |
17:49 |
|
ahino joined #gluster |
17:49 |
JoeJulian |
You're welcome. Keep an eye on that. I suspect there's a reason those were not running. |
17:49 |
JoeJulian |
With that many bricks, it might have been oom purging. |
17:52 |
dnoland1 |
Will do. Setting up icinga to monitor nfs daemon now |
18:01 |
|
haomaiwa_ joined #gluster |
18:02 |
|
hchiramm joined #gluster |
18:09 |
|
theron joined #gluster |
18:16 |
|
Manikandan joined #gluster |
18:17 |
|
voobscout joined #gluster |
18:21 |
|
jbrooks joined #gluster |
18:29 |
|
NuxRo joined #gluster |
18:42 |
|
nishanth joined #gluster |
18:43 |
|
voobscout joined #gluster |
18:44 |
|
amye joined #gluster |
18:45 |
|
calavera joined #gluster |
18:54 |
|
Melamo joined #gluster |
18:55 |
|
Melamo left #gluster |
19:00 |
|
ahino joined #gluster |
19:01 |
|
haomaiwa_ joined #gluster |
19:03 |
|
calavera joined #gluster |
19:07 |
s-hell |
Hello everyone! |
19:08 |
s-hell |
Can anyone help me with this error: https://paste.pcspinnt.de/view/raw/f19c97d2 |
19:11 |
|
jobewan joined #gluster |
19:12 |
|
mtanner joined #gluster |
19:36 |
JoeJulian |
s-hell: Did you actually define the geosync master as localhost? |
19:44 |
s-hell |
hm, no. don't think so. |
19:44 |
s-hell |
wait a second... |
19:44 |
s-hell |
no, master node is set to hostname |
19:45 |
s-hell |
i've used the georepsetup tool |
19:46 |
s-hell |
JoeJulian: georepsetup pimages geouser www1.ambiendo.ovh pimages |
19:46 |
|
theron joined #gluster |
19:47 |
|
theron joined #gluster |
19:48 |
|
ovaistariq joined #gluster |
19:49 |
|
calavera joined #gluster |
19:49 |
s-hell |
there is no other error message only this error. |
19:52 |
|
cliluw joined #gluster |
20:01 |
|
haomaiwang joined #gluster |
20:05 |
Trefex |
hi all. I have a 3-node distributed setup, and have found it impossible to load my data into the cluster for many months now |
20:05 |
Trefex |
i have tried mounting directly, rsync to an rsyncd, nfs, smbd |
20:06 |
Trefex |
after a while i get timeouts and the rsync process fails |
20:06 |
Trefex |
i have many small files, and i am using rsync so that i can resume, but I can't get my dataset in |
20:06 |
Trefex |
any ideas? |
20:07 |
|
deniszh joined #gluster |
20:12 |
s-hell |
got it. it was a wrong mountbroker configuration |
20:15 |
JoeJulian |
s-hell: Nice, glad you got it figured out. |
20:16 |
JoeJulian |
Trefex: Check your client logs for clues. Try disabling performance translators one-at-a-time and see if that helps. See if there's something loading up on your servers that's causing them not to respond. |
20:17 |
Trefex |
JoeJulian: performance translators? |
20:17 |
JoeJulian |
gluster volume set help | grep performance |
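(Aside: a sketch of what disabling the performance translators one at a time looks like, assuming the volume is named 'live' as the client log excerpts further down suggest; 'gluster volume reset live <option>' restores the default before trying the next one:
    gluster volume set live performance.write-behind off
    gluster volume set live performance.io-cache off
    gluster volume set live performance.quick-read off
    gluster volume set live performance.read-ahead off
    gluster volume set live performance.stat-prefetch off)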
20:17 |
Trefex |
JoeJulian: i guess it's due to a long rsync process |
20:18 |
|
voobscout joined #gluster |
20:18 |
JoeJulian |
gluster shouldn't care. |
20:18 |
JoeJulian |
It's just a bunch of writes to an open file descriptor. |
20:19 |
Trefex |
JoeJulian: this is my current config http://ur1.ca/ok8qd |
20:19 |
glusterbot |
Title: #327447 Fedora Project Pastebin (at ur1.ca) |
20:19 |
Trefex |
JoeJulian: do you see anything fishy ? |
20:20 |
JoeJulian |
"volume info" is more useful since that only shows the things that have been changed from default. |
20:20 |
|
Kins joined #gluster |
20:20 |
|
telmich joined #gluster |
20:20 |
|
klaas joined #gluster |
20:20 |
|
partner joined #gluster |
20:20 |
|
the-me joined #gluster |
20:20 |
|
javi404 joined #gluster |
20:20 |
|
_nixpanic joined #gluster |
20:20 |
|
frakt joined #gluster |
20:20 |
|
renout_away joined #gluster |
20:20 |
|
zerick joined #gluster |
20:20 |
|
cuqa_ joined #gluster |
20:20 |
|
xavih joined #gluster |
20:20 |
|
ron-slc joined #gluster |
20:20 |
|
csaba joined #gluster |
20:20 |
JoeJulian |
But even then, there's no setting that should *make* it timeout. |
20:20 |
|
ws2k3_ joined #gluster |
20:20 |
|
tru_tru joined #gluster |
20:20 |
|
_nixpanic joined #gluster |
20:20 |
|
dastar joined #gluster |
20:21 |
|
Iouns joined #gluster |
20:21 |
|
s-hell joined #gluster |
20:21 |
|
inodb joined #gluster |
20:21 |
|
malevolent joined #gluster |
20:21 |
Trefex |
JoeJulian: didn't know that http://ur1.ca/ok8qq |
20:21 |
|
kenansulayman joined #gluster |
20:21 |
|
bhuddah joined #gluster |
20:21 |
glusterbot |
Title: #327451 Fedora Project Pastebin (at ur1.ca) |
20:21 |
|
lkoranda joined #gluster |
20:21 |
|
kenansulayman joined #gluster |
20:21 |
Trefex |
also just for funsies, on another setup, i have 1 rsync to NFS over ZFS and 1 rsync of same data to GlusterFS |
20:21 |
Trefex |
one takes 40 mins, the other 7 hours |
20:22 |
|
dblack joined #gluster |
20:22 |
Trefex |
which I read is normal for Gluster, so that's not cool :( |
20:22 |
JoeJulian |
Well changing the log-levels to prevent seeing what might be a problem may make it more difficult for you to diagnose. |
20:22 |
Trefex |
ow |
20:22 |
Trefex |
client or brick ? |
20:22 |
|
scuttle` joined #gluster |
20:23 |
JoeJulian |
Maybe both. Depends. You're trying to diagnose a problem. Information helps with that. |
20:23 |
|
wistof joined #gluster |
20:23 |
Trefex |
gotcha, i think it's simply speed, gluster is crap with small files, but i didn't find a better alternative :) |
20:24 |
JoeJulian |
cluster.data-self-heal-algorithm: full shouldn't come into play here but do you have a reason you want it set that way? |
20:24 |
Trefex |
JoeJulian: what could be a reason? I took over this setup and got no handover |
20:24 |
Trefex |
so not sure why the options were set that way |
20:24 |
|
_fortis joined #gluster |
20:25 |
|
samikshan joined #gluster |
20:25 |
JoeJulian |
Heh, in fact, there's no reason at all. I just noticed this isn't replicated so that's never used. |
20:26 |
virusuy |
Hi all! I'm setting quotas on a folder with some data in it and the "used available" column in 'quota list' doesn't seem to be reflecting reality (i'm on 3.6.x) |
20:26 |
Trefex |
JoeJulian: which is something i might change, because right now, when a disk breaks, i have to take down whole cluster |
20:26 |
virusuy |
i mean, in that folder there are like 2 TB of data, and quota says it's only using 500G |
20:26 |
JoeJulian |
virusuy: search the mailing list. I saw mention of that there recently. |
20:26 |
virusuy |
JoeJulian: gotcha! |
20:27 |
|
marlinc joined #gluster |
20:28 |
JoeJulian |
Trefex: With your rsync, are you using --inplace? If not, you're going to be adding some extra dereferences to a lookup. |
20:28 |
|
john51 joined #gluster |
20:28 |
Trefex |
JoeJulian: ya |
20:28 |
JoeJulian |
dedup with zfs? |
20:28 |
Trefex |
rsync -RraWHvzP --timeout=3000 --inplace --delete hcs/ /mnt/tmpMounts/hcs is what i use right now |
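(Aside: a hedged variant of that command worth trying: on a local network -z mostly burns CPU without saving much, while --inplace and -W are already the right calls for a gluster target; same paths as Trefex's command:
    rsync -RraWHvP --timeout=3000 --inplace --delete hcs/ /mnt/tmpMounts/hcs)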
20:30 |
Trefex |
JoeJulian: now the client debug logs are filled with "on live-client-1 returned error [Stale file handle] |
20:30 |
Trefex |
JoeJulian: http://ur1.ca/ok8rq is the ZFS setup of one of the nodes |
20:30 |
glusterbot |
Title: #327459 Fedora Project Pastebin (at ur1.ca) |
20:32 |
Trefex |
JoeJulian: basically it's off |
20:41 |
virusuy |
JoeJulian: seems like this quota issue is on 3.6.x and the only workaround is update to 3.7 |
20:41 |
virusuy |
JoeJulian: thanks for the heads up |
20:42 |
JoeJulian |
You're welcome. |
20:42 |
Trefex |
JoeJulian: why does it say this 0-rpc-clnt: submitted request (XID: 0x29e3f8b Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (live-client-2) |
20:42 |
Trefex |
even though i'm using Gluster 3.7.8 ? |
20:42 |
JoeJulian |
The rpc version is 330. |
20:42 |
Trefex |
oh i see |
20:44 |
|
steveeJ joined #gluster |
20:48 |
|
amye joined #gluster |
20:51 |
|
ovaistariq joined #gluster |
20:55 |
|
calavera joined #gluster |
21:01 |
|
haomaiwang joined #gluster |
21:04 |
|
theron joined #gluster |
21:13 |
|
farhorizon joined #gluster |
21:20 |
|
anoopcs joined #gluster |
21:24 |
|
cuqa_ joined #gluster |
21:30 |
|
jri joined #gluster |
21:34 |
|
deniszh joined #gluster |
21:41 |
cpetersen_ |
JoeJulian: If I were to simulate another failure and did get a momentary split-brain again, in that moment, what logs would you like me to pull? And, if I do that would you be able to take a peek? |
21:44 |
JoeJulian |
cpetersen_: The client log from ganesha (both the pre-ip-move, and post-ip-move clients), the brick logs, self-heal logs and, most importantly, 'getfattr -m . -d -e hex' for the file on the servers reporting split-brain. |
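(Aside: the getfattr is run against the file's path on each brick, not on the client mount; a hypothetical example, with the brick path invented for illustration and the filename taken from the logs below:
    getfattr -m . -d -e hex /export/brick1/vmvol01/DIR01/DIR01.nvram    # repeat on every server holding a replica)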
21:45 |
|
tessier joined #gluster |
21:45 |
tessier |
Hello all! I just broke my cluster. :( I accidentally mounted something else over the gluster mountpoint on the brick nodes. |
21:45 |
tessier |
Feb 22 14:09:50 disk10 gluster-j-brick[4388]: [2016-02-22 22:09:50.444248] M [posix-helpers.c:1718:posix_health_check_thread_proc] 0-9j-posix: health-check failed, going down |
21:46 |
tessier |
How do I fix this? I've restarted gluster.... |
21:47 |
tessier |
Ah....phew. |
21:47 |
tessier |
I restarted everything again and it came back up. |
21:47 |
tessier |
That was stupid. Let's not do that again. |
21:47 |
JoeJulian |
Assuming you fixed the mount, "gluster volume start $volname force" would have done it. |
21:53 |
|
deniszh1 joined #gluster |
21:54 |
cpetersen_ |
Hmmm... client log, trying to remember where that is. |
21:55 |
cpetersen_ |
I think it's ganesha-gfapi.log but there are really only two so that must be it. |
21:59 |
|
Philambdo1 joined #gluster |
21:59 |
JoeJulian |
seems like a good guess. |
22:01 |
|
64MAADRUZ joined #gluster |
22:02 |
|
Philambdo1 joined #gluster |
22:02 |
cpetersen_ |
JoeJulian: http://paste.fedoraproject.org/327499/14561784/ |
22:02 |
glusterbot |
Title: #327499 Fedora Project Pastebin (at paste.fedoraproject.org) |
22:03 |
cpetersen_ |
Don't have to run "getfattr -m . -d -e hex" as the files are known, right? |
22:03 |
|
NuxRo joined #gluster |
22:03 |
JoeJulian |
Have to run those since we want to know *why* gluster thinks they're split-brain. |
22:03 |
cpetersen_ |
Ah ok, will do. |
22:04 |
cpetersen_ |
Well. |
22:04 |
cpetersen_ |
It doesn't think that anymore |
22:04 |
JoeJulian |
Oh, wait... It's only complaining about the volume root? |
22:04 |
cpetersen_ |
It's not in split-brain anymore and the VM failed over appropriately. |
22:04 |
cpetersen_ |
:P |
22:05 |
|
ParsectiX joined #gluster |
22:05 |
cpetersen_ |
Yes - apparently JoeJulian. |
22:05 |
JoeJulian |
Is it only lines 10 and 43 that you're concerned with? |
22:05 |
cpetersen_ |
Correct. |
22:05 |
cpetersen_ |
In this instance, at least. |
22:06 |
JoeJulian |
I've never seen a split-brain volume root, but even if I had, I don't think it would be a problem. |
22:07 |
JoeJulian |
Maybe I'll force one and see if it breaks anything. |
22:07 |
|
ovaistariq joined #gluster |
22:08 |
|
amye joined #gluster |
22:12 |
cpetersen_ |
Also, should I be upgrading gluster from 3.7.6? |
22:14 |
|
deniszh joined #gluster |
22:14 |
JoeJulian |
I would (did). |
22:16 |
cpetersen_ |
Any issues? |
22:16 |
cpetersen_ |
Benefits? |
22:19 |
JoeJulian |
The only issue (which didn't affect my use case) is a performance problem with performance.write-behind ( bug 1309462 ). |
22:19 |
glusterbot |
Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1309462 low, unspecified, ---, bugs, NEW , Upgrade from 3.7.6 to 3.7.8 causes massive drop in write performance. Fresh install of 3.7.8 also has low write performance |
22:20 |
JoeJulian |
Benefits: a very large number of memory leaks fixed thanks to post-factum. |
22:20 |
cpetersen_ |
What causes the problem? Having that feature enabled? I don't think I have it enabled. |
22:20 |
JoeJulian |
It's enabled by default. |
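(Aside: a minimal sketch of turning it off should the regression bite, with the volume name taken from the client log quoted further down:
    gluster volume set SHARED_vmvol01 performance.write-behind off
    gluster volume info SHARED_vmvol01    # confirm under "Options Reconfigured")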
22:20 |
cpetersen_ |
OIC. |
22:21 |
|
cyberbootje joined #gluster |
22:26 |
|
ovaistariq joined #gluster |
22:26 |
cpetersen_ |
Holy crap, ganesha-gfapi.log is 200 MB... |
22:30 |
cpetersen_ |
It's very interesting. The files aren't locked, but my VM just will not start up |
22:30 |
cpetersen_ |
It's failed over to another host, so HA works fine, but the VM will just not start up |
22:30 |
cpetersen_ |
If I bring the original host back up, it does start up fine |
22:32 |
JoeJulian |
define "will not start up" |
22:33 |
cpetersen_ |
The VMware machine posts, but following that it's just black. |
22:33 |
cpetersen_ |
As if the VM nvram is not accessible. |
22:34 |
JoeJulian |
Anything in that 200MB log when that happens? |
22:38 |
cpetersen_ |
eh oh |
22:38 |
|
kovshenin joined #gluster |
22:39 |
cpetersen_ |
http://ur1.ca/ok98l |
22:39 |
glusterbot |
Title: #327507 Fedora Project Pastebin (at ur1.ca) |
22:39 |
post-factum |
JoeJulian: that's why I've cherry-picked memleak-related fixes on top of 3.7.6: https://github.com/pfactum/glusterfs/commits/fixes-3.7.6 |
22:39 |
glusterbot |
Title: Commits · pfactum/glusterfs · GitHub (at github.com) |
22:40 |
post-factum |
but that is sad, I want my fresh shiny bug-free 3.7.9! |
22:40 |
cpetersen_ |
DIR01 is the affected VM, FYI. |
22:41 |
JoeJulian |
Show me a piece of software with no known bugs, and I'll show you a piece of software that isn't used. |
22:41 |
cpetersen_ |
I don't like those errors. lol |
22:42 |
post-factum |
JoeJulian: saw some related joke on that, like if the app's size approaches 0, the amount of debugging effort also approaches 0, so a zero-sized app tends to be bug-free |
22:42 |
JoeJulian |
Someone will still complain about it though. |
22:43 |
cpetersen_ |
:P |
22:43 |
post-factum |
yep, size == 0 is edge case, so definitely it will trigger bugs in other apps ;) |
22:43 |
|
dnoland1 left #gluster |
22:44 |
|
kenhui joined #gluster |
22:44 |
|
amye joined #gluster |
22:44 |
JoeJulian |
hmm, "event generation 6" is new... must be an afr2 thing. |
22:46 |
JoeJulian |
Ah, looks like it might be an invalid interpretation. A comment from afr_read_subvol_select_by_policy when it returns -1 reads, "no readable subvolumes, either split brain or all subvols down". So it may not be split-brain, but rather all subvolumes may be down. |
22:48 |
JoeJulian |
So why is it losing connection to all subvolumes? Network? |
22:49 |
JoeJulian |
And I really wish they would enforce either spaces or tabs in the source. I hate trying to read code that's indented every which way. |
22:50 |
cpetersen_ |
In specific, where would the connection be dropping? |
22:50 |
cpetersen_ |
I did kill one of the bricks, just not the one serving the ganesha nfs share. |
22:51 |
* JoeJulian |
slaps cpetersen_ with a wet trout for throwing in extra variables during diagnostics. |
22:53 |
cpetersen_ |
?!?!?! |
22:53 |
cpetersen_ |
I didn't throw in extra variables. I killed an appliance. I went and hard shutdown the appliance to test failure. |
22:54 |
cpetersen_ |
The VMs are both on that host: brick2 and the Windows VM that needs to fail over to host 1. |
22:54 |
cpetersen_ |
=) |
22:54 |
cpetersen_ |
I'm hyper-converged, remember? |
22:56 |
cpetersen_ |
The other VM is affected as well. |
22:57 |
cpetersen_ |
Strange part is that the other VM, ACS01, is located on host 1 which has brick 1 on it. Neither that host nor the nfs share was affected. |
22:57 |
cpetersen_ |
But that VM will not boot either. |
22:57 |
JoeJulian |
Ok, I misunderstood "I did kill one of the bricks" to mean one *other* brick. |
22:57 |
cpetersen_ |
Correct. |
22:58 |
cpetersen_ |
Bricks 1, 2 and 3. 1 has the NFS share primary VIP that I am consuming. I killed host 2 which took down brick 2 and initiated a VMware HA failover of DIR01 to host 3. |
22:59 |
cpetersen_ |
The NFS share was not affected, nor the files in an adverse way because if I boot host 2 up again, the VM will start up just fine. |
22:59 |
cpetersen_ |
I am perplexed. |
22:59 |
|
mobaer joined #gluster |
22:59 |
JoeJulian |
Right, so you should have retained quorum and ganesha should have had active connections to 1 and 3 still. |
22:59 |
cpetersen_ |
Correct, and it does. |
22:59 |
JoeJulian |
Not according to that log. |
23:00 |
cpetersen_ |
I have server quorum set to server and volume quorum set to auto. |
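(Aside: the settings cpetersen_ describes correspond to these two volume options, shown here as a hedged sketch with the volume name from the client logs below:
    gluster volume set SHARED_vmvol01 cluster.server-quorum-type server
    gluster volume set SHARED_vmvol01 cluster.quorum-type auto)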
23:00 |
cpetersen_ |
Should I change volume quorum to 2? |
23:00 |
JoeJulian |
It doesn't go back to the beginning so I can't see if it ever had a connection. |
23:00 |
cpetersen_ |
Actually, to 1. |
23:00 |
JoeJulian |
No, auto is fine. |
23:00 |
cpetersen_ |
Here are the complete logs. |
23:00 |
cpetersen_ |
http://filebin.ca/2XtFzEB952Dt/file03logs.7z |
23:01 |
cpetersen_ |
Thank God for compression... |
23:01 |
|
haomaiwang joined #gluster |
23:06 |
|
theron joined #gluster |
23:06 |
JoeJulian |
cpetersen_: let me see volume info |
23:08 |
cpetersen_ |
"gluster v info": http://ur1.ca/ok9cc |
23:08 |
glusterbot |
Title: #327511 Fedora Project Pastebin (at ur1.ca) |
23:11 |
cpetersen_ |
:D |
23:11 |
cpetersen_ |
I felt like a real moron when I had my bricks mounted previously under the /run/gluster/shared_storage folder ... gah |
23:13 |
cpetersen_ |
But let's not talk about that shall we >.< |
23:17 |
JoeJulian |
:D |
23:18 |
|
HugHern_ joined #gluster |
23:18 |
cpetersen_ |
root, which is the owner, has RW on the files in the ESXi datastore |
23:18 |
cpetersen_ |
so there are no locks present that I can see |
23:20 |
cpetersen_ |
Nothing that VMware doesn't do natively as per the norm that is. ie, *.vmx.lck file. |
23:21 |
JoeJulian |
No, this totally looks like either a network condition, or a race. |
23:22 |
JoeJulian |
I'm just not completely sure what's supposed to be happening in some of these bits. |
23:22 |
JoeJulian |
Or why there's any logs from dht. |
23:24 |
JoeJulian |
The one thing I'm 99% sure of is that there's no split-brain happening. |
23:24 |
JoeJulian |
It's simply failing to pick a read-subvolume. |
23:25 |
cpetersen_ |
So gluster is struggling to pick a brick to pull from? |
23:25 |
cpetersen_ |
So then presumably, ganesha is working fine |
23:26 |
cpetersen_ |
NFS 3 is doing the job, but the gluster client is struggling |
23:27 |
JoeJulian |
That's the way I'm interpreting this. |
23:27 |
JoeJulian |
Do you have an open bug report? |
23:27 |
cpetersen_ |
Why would the VM fail to start though? ESXi can see and list all of the files on the share... |
23:27 |
cpetersen_ |
Not for this no, I don't feel I've identified a culprit yet |
23:28 |
JoeJulian |
At the moment it tries, it cannot read the nvram file |
23:28 |
JoeJulian |
[2016-02-19 20:58:35.632138] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-SHARED_vmvol01-client-2: remote operation failed. Path: /DIR01/DIR01.nvram (2a03a3c2-c444-46a5-b754-41b4f70d27ed) [No such file or directory] |
23:29 |
cpetersen_ |
Right... |
23:29 |
cpetersen_ |
Makes sense. |
23:29 |
tessier |
JoeJulian: Thanks for the tip on "gluster volume start $volname force". Duly noted. |
23:30 |
|
chromatin joined #gluster |
23:30 |
JoeJulian |
cpetersen_: file a bug. Include those logs and volume info. Describe the steps to create the failure. |
23:30 |
glusterbot |
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS |
23:30 |
JoeJulian |
If I see anything missing, I'll add my 2c. |
23:35 |
cpetersen_ |
JoeJulian: The error you posted there didn't occur today when I simulated the failure. |
23:35 |
cpetersen_ |
There were no remote operation failed messages today actually. |
23:35 |
cpetersen_ |
Well, in relation to storage volumes, hold up, I may be lying |
23:36 |
cpetersen_ |
"0-SHARED_vmvol01-replicate-0: Unreadable subvolume -1 found with event generation 6 for gfid d72a3396-f392-404d-91b7-1f1608cd61be. (Possible split-brain)" |
23:36 |
cpetersen_ |
"0-SHARED_vmvol01-dht: <gfid:d72a3396-f392-404d-91b7-1f1608cd61be>: failed to lookup the file on SHARED_vmvol01-dht [Stale file handle]" |
23:36 |
cpetersen_ |
These are the ones we are concerned about now, no? |
23:39 |
|
ovaistariq joined #gluster |
23:40 |
|
kenhui joined #gluster |
23:41 |
cpetersen_ |
What is <brick>-dht? |
23:41 |
|
arcolife joined #gluster |
23:42 |
cpetersen_ |
Well nevermind, found your article. |
23:42 |
cpetersen_ |
=) |
23:45 |
|
kovshenin joined #gluster |