IETF
Codesprint
codesprint@jabber.ietf.org
Monday, March 7, 2016
RjS has set the subject to: Codesprint

GMT+0
[17:38:55] rjsparks@nostrum.com joins the room
[17:40:12] glen joins the room
[17:40:16] <glen> Hello?
[17:40:28] henrik joins the room
[17:40:29] <rjsparks@nostrum.com> seeing if I can get henrik too
[17:40:35] <glen> Good day!
[17:40:37] <rjsparks@nostrum.com> yes - we seem to be here
[17:40:41] <henrik> hi glen
[17:40:46] <glen> Henrik, sorry to bug you, but ghostlink is down and will not start.
[17:40:50] <henrik> I saw your messages — didn't you see mine?
[17:40:52] <glen> It has a traceback in the log
[17:41:02] <glen> All I got from you is a warning about me not having an OTP Plugin of some kind.
[17:41:15] <henrik> ugh
[17:41:16] <glen> Sorry
[17:43:06] <rjsparks@nostrum.com> (I assume communication is happening elsewhere?
[17:43:08] <rjsparks@nostrum.com> )
[17:48:16] <henrik> yes, sorry
[17:48:17] <glen> Sorry - Henrik popped a new window to me and I didn't even notice it. I'm swamped.
[17:48:18] <glen> Yes
[17:48:23] <henrik> wanted to make sure we could
[17:48:23] <glen> Henrik is checking it.
[17:48:29] <glen> THANK YOU BOTH!!!
[17:48:29] <rjsparks@nostrum.com> ok - all goo
[17:48:31] <rjsparks@nostrum.com> good even
[17:48:32] <henrik> so, glen,
[17:48:37] <henrik> no, hold on
[17:48:38] <glen> Yes sir!
[17:48:47] <rjsparks@nostrum.com> I'm going to go afk - if you need me in the next bit, please ping my cell?
[17:48:53] <henrik> sure
[17:49:36] <henrik> is the filesystem setup the same, or have you spread the logical filesystem over multiple partitions in a different way?
[17:52:33] <glen> Same setup
[17:52:35] <glen> All in /a
[17:52:37] <glen> /a/www
[17:52:45] <glen> symlink to /www
[17:55:38] <henrik> so, if I try to do '$ sudo su wwwrun' and variations thereof, it fails silently
[17:58:54] <henrik> and the log messages are saying something they should not, when I run it in the foreground: note the last 2 words of this message:
[17:59:06] <henrik> "ghostlinkd: Error: [Errno 1] Operation not permitted when trying to hard link /www/www6s/archive/id/draft-sfv-mpls-tp-fault-01.txt == /a/www/www6s/draft-archive/draft-sfv-mpls-tp-fault-01.txt as henrik"
[17:59:27] <glen> Checking hang on
[18:01:20] <glen> For some reason, user wwwrun had a shell on IETFA.
[18:01:29] <glen> That seems wrong, but it has /bin/false on IETFC.
[18:01:39] <henrik> ok
[18:01:42] <glen> I can change it, but ghostlink *was* running on IETFC
[18:01:46] <glen> without that change.
[18:01:59] <henrik> ok, then we don't change that at the moment
[18:02:07] <glen> # su wwwrun -c "ls -al /a/www"
total 700
drwxr-sr-x 25 root        root    4096 Jan 28 10:25 .
drwxr-xr-x 24 root        root    4096 Mar  4 02:55 ..
drwxr-sr-x  2 root        root    4096 Nov 13  2011 Group Data
drwxrwxr-x 52 glen        admin   4096 Jan 26 15:23 audio
-rw-r--r--  1 root        root      13 Jul 14  2015 devprocs
-rw-r--r--  1 root        root     514 Jan 28 10:25 devrsync
[18:02:15] <glen> Sorry
[18:02:16] <glen> su wwwrun -c "ls -al /a/www"
[18:02:19] <glen> That works...
[18:02:43] <henrik> however, even if I do $ sudo -u wwwrun /bin/bash, and then try to hardlink manually, it fails
[18:02:51] <glen> Hang on one sec
[18:02:59] <glen> My former command was run on IETFA.  :-(
[18:03:12] <glen> ietfc:/home/glen # su wwwrun -s /bin/bash -c "ls -al /a/www"
[18:03:53] <glen> In the absence of a shell, you have to specify -s /bin/bash to su.
[18:04:06] <henrik> yes, I tried that too ,:-)
[18:04:44] <glen> Did it fail for you?
[18:04:47] <glen> Command?
[18:04:52] <glen> As root I can:
[18:04:52] <henrik> Hmm, no, apparently not — at least now it works.
[18:04:57] <glen> Ok
[18:05:22] <henrik> but anyway, getting a wwwrun shell with sudo -u still results in permission issues
[18:05:37] <henrik> Quoting:
[18:05:39] <henrik> wwwrun@ietfc $ ln /a/www/www6s/draft-archive/draft-sfv-mpls-tp-fault-01.txt /www/www6s/archive/id/draft-sfv-mpls-tp-fault-01.txt
ln: failed to create hard link ‘/www/www6s/archive/id/draft-sfv-mpls-tp-fault-01.txt’ => ‘/a/www/www6s/draft-archive/draft-sfv-mpls-tp-fault-01.txt’: Operation not permitted
[18:05:59] <henrik> so it's a file system permission issue
[18:06:04] <henrik> or related
[18:06:14] <henrik> that's why I asked if there were different partitions
[18:06:36] <henrik> softlinking works
[18:06:42] <henrik> but not hardlinking
[18:07:32] <glen> I get:
[18:07:41] <glen> wwwrun@ietfc:/www/www6s> ln /a/www/www6s/draft-archive/draft-sfv-mpls-tp-fault-01.txt /www/www6s/archive/id/draft-sfv-mpls-tp-fault-01.txt
ln: failed to create hard link '/www/www6s/archive/id/draft-sfv-mpls-tp-fault-01.txt': File exists
[18:08:00] <glen> I see it cannot create new files hang on
[18:08:12] <henrik> that's probably because I just created a softlink ,:-)
[18:08:32] <glen> It has write permission on the directory
[18:08:38] <glen> wwwrun@ietfc:/www/www6s/draft-archive> touch z
wwwrun@ietfc:/www/www6s/draft-archive> rm z
[18:09:05] <glen> AH!
[18:09:11] <glen> Not the destination
[18:09:12] <glen> Hang on
[18:09:23] <glen> Nope
[18:09:35] <glen> wwwrun@ietfc:/www/www6s/archive/id> touch z
wwwrun@ietfc:/www/www6s/archive/id> rm z
[18:09:47] <henrik> yes, I also tried touching something as the first check
[18:09:51] <glen> As user wwwrun I can write to both source and target
[18:09:52] <henrik> but I used foobar
[18:09:59] <glen> And I can symlink
[18:10:01] <glen> But not hardlink???
[18:10:04] <henrik> yes
[18:10:05] <henrik> yes
[18:10:09] <glen> I have NEVER seen that before.
[18:10:21] <glen> Ugh.
[18:10:41] <henrik> I've only seen that when the hardlink would cross partitions
[18:12:16] <glen> I can't even hardlink in the same directory
[18:12:27] <glen> But I can symlink
[18:12:28] <glen> UGH
[18:12:30] <glen> Checking
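The cross-partition case henrik mentions can be ruled in or out by comparing device numbers before attempting the link. A minimal Python sketch, assuming nothing about ghostlinkd itself; the function name `try_hardlink` is hypothetical:

```python
import errno
import os


def try_hardlink(src, dst):
    """Attempt os.link(src, dst) and return a short diagnosis string."""
    # os.stat follows symlinks, so a path under /www that really lives
    # under /a/www reports the device of the underlying filesystem.
    src_dev = os.stat(src).st_dev
    dst_dev = os.stat(os.path.dirname(dst) or ".").st_dev
    if src_dev != dst_dev:
        return "cross-device: source and target are on different filesystems"
    try:
        os.link(src, dst)
    except OSError as e:
        # Translate the numeric errno (e.g. 1) into its symbolic name (EPERM).
        name = errno.errorcode.get(e.errno, str(e.errno))
        return "failed: %s (%s)" % (name, e.strerror)
    return "ok"
```

If the device numbers match and the call still fails with EPERM, as in the messages quoted above, the cause lies somewhere other than the partition layout.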
[18:15:48] <henrik> you changed something?
[18:15:51] <glen> no
[18:15:54] <glen> Why
[18:15:55] <henrik> Hmm.
[18:15:57] <glen> Just testing
[18:16:02] <henrik> Things seem to work, now …
[18:16:02] <glen> I can do ln y z as root in /a/www
[18:16:31] <henrik> I can do touch foo; hardlink foo bar in archive/id/ now
[18:17:17] <glen> ghostlink stats
[18:17:20] <glen> starts
[18:17:45] <glen> Looks like a process wasn't initialized correctly.
[18:17:52] <henrik> ok, that whole thing was very enlightening ??!??
[18:17:57] <henrik> aha?
[18:18:01] <henrik> what did you find?
[18:18:03] <glen> I have no idea.
[18:18:10] <henrik> oh :-(
[18:18:30] <glen> It looks like ghostlink was doing a ton of linking
[18:18:40] <glen> IS doing
[18:19:08] <henrik> oh, yes, I'd expect it to, if it's not been running for a little while
[18:20:17] <henrik> hmm.  did it stop again?
[18:20:28] <glen> Yes.  Something is bounding.
[18:20:30] <glen> bouncing.
[18:20:35] <henrik> ok.
[18:20:55] <glen> 2016-03-07T10:20:01.322347-08:00 ietfc systemd[15743]: Starting Paths.
2016-03-07T10:20:01.323273-08:00 ietfc systemd[15743]: Reached target Paths.
2016-03-07T10:20:01.323932-08:00 ietfc systemd[15743]: Starting Timers.
2016-03-07T10:20:01.324504-08:00 ietfc systemd[15743]: Reached target Timers.
2016-03-07T10:20:01.325070-08:00 ietfc systemd[15743]: Starting Sockets.
2016-03-07T10:20:01.325650-08:00 ietfc systemd[15743]: Reached target Sockets.
2016-03-07T10:20:01.326206-08:00 ietfc systemd[15743]: Starting Basic System.
2016-03-07T10:20:01.326808-08:00 ietfc systemd[15743]: Reached target Basic System.
2016-03-07T10:20:01.327365-08:00 ietfc systemd[15743]: Starting Default.
2016-03-07T10:20:01.327921-08:00 ietfc systemd[15743]: Reached target Default.
2016-03-07T10:20:01.328471-08:00 ietfc systemd[15743]: Startup finished in 10ms.
[18:20:57] <henrik> Let me check if it tells us what it was doing when it gets an exception at that point.
[18:23:12] <glen> Apparently the above is "normal" for systemd
[18:23:13] <glen> Ugh
[18:25:18] <henrik> Yes, so it was trying to set up the hardlink it announced on the line before the exception:
[18:25:37] <henrik> new hard link /www/www6s/archive/id/2006-05-10_1324-404.cgi == /a/www/www6s/tool-id-archive/2006-05-10_1324-404.cgi
[18:27:15] <glen> Could ghostlink itself be the cause here?
[18:27:18] <glen> I am still hunting.
[18:27:26] <glen> I would hate to have to fail back to the old server because of this.
[18:27:34] <henrik> Oh, no, we'll get it working
[18:27:47] <henrik> I see that the file on which it failed had different permissions
[18:27:57] <henrik> -r-xr-xr-x 1 mlarson admin 10966 2011-08-15 01:49 /a/www/www6s/tool-id-archive/2006-05-10_1324-404.cgi
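The listing above shows a file owned by mlarson with no write bit set, while the link is being attempted as wwwrun. On Linux kernels with the `fs.protected_hardlinks` sysctl enabled, a user who neither owns a file nor has read and write access to it is refused a hard link with EPERM. Whether that setting is active on ietfc is an assumption; nothing in this log confirms it. A small Python sketch (hypothetical helper name `link_risk`) that reports the relevant facts for one file:

```python
import os
import stat


def link_risk(path):
    """Report ownership and access facts relevant to hard-linking `path`."""
    st = os.lstat(path)
    info = {
        "regular_file": stat.S_ISREG(st.st_mode),
        "owned_by_caller": st.st_uid == os.getuid(),
        # os.access checks against the real uid/gid of this process.
        "caller_can_read_write": os.access(path, os.R_OK | os.W_OK),
        "mode": stat.filemode(st.st_mode),
    }
    try:
        with open("/proc/sys/fs/protected_hardlinks") as f:
            info["protected_hardlinks"] = f.read().strip()
    except OSError:
        # Not Linux, or the sysctl file is not readable here.
        info["protected_hardlinks"] = None
    return info
```

Run as wwwrun against the failing file, a result with `owned_by_caller` and `caller_can_read_write` both false and `protected_hardlinks` equal to "1" would be consistent with the EPERM seen here, though it would not prove it.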
[18:28:07] <glen> I was thinking that "Stopping Paths" was the problem - but Paths isn't stopping - systemd is just rewriting the old log entries every time a cron job runs.  :-(
[18:29:48] <henrik> Huh.  I don't particularly want to know even more about how systemd is silly ,,:-)  What I've heard and seen is enough.  However, we'll have to catch the exception, then move on, I think.
[18:31:21] <glen> I have two processes running right now - creating and destroying hardlinks in /a once per second.  One as root, one as wwwrun.
[18:31:28] <glen> Of course, it's all working at the moment.  :-(
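A test loop like the one glen describes could look roughly like this in Python. This is a sketch, not the script glen actually ran; the function name, the file names, and the `iterations` and `delay` parameters are all hypothetical:

```python
import os
import time


def link_unlink_loop(directory, iterations=5, delay=1.0):
    """Create and remove a hard link repeatedly; return the failure count."""
    src = os.path.join(directory, "linktest.src")
    dst = os.path.join(directory, "linktest.dst")
    with open(src, "w"):
        pass  # create an empty source file to link against
    failures = 0
    try:
        for _ in range(iterations):
            try:
                os.link(src, dst)
                os.unlink(dst)
            except OSError as e:
                failures += 1
                print("link/unlink failed: %s" % e)
            time.sleep(delay)
    finally:
        # Clean up whatever is left, even if the loop was interrupted.
        if os.path.exists(dst):
            os.unlink(dst)
        os.unlink(src)
    return failures
```

Running one copy as root and one as wwwrun, as described above, would show whether the failure depends on the user.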
[18:31:39] <glen> ghostlink is dead
[18:31:55] <glen> Did you kill it or shall I restart it?
[18:34:34] <glen> All of the strange log entries from systemd have not broken my hard links yet
[18:44:23] <glen> Did you just restart ghostlink?
[18:47:40] <henrik> I've been running it manually in the foreground, and after adding exception handling to get around files which cause problems, it now seems to run ok.
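The approach henrik describes, catching the exception for a problem file and moving on, might be sketched as follows. This is not the actual ghostlinkd patch; `link_all`, its argument, and the logger name are hypothetical:

```python
import logging
import os

log = logging.getLogger("ghostlink-sketch")


def link_all(pairs):
    """Hard-link each (source, target) pair; log and skip the ones that fail."""
    skipped = []
    for src, dst in pairs:
        try:
            os.link(src, dst)
        except OSError as e:
            # One bad file should not bring the whole run down.
            log.warning("skipping %s == %s: %s", dst, src, e)
            skipped.append((src, dst))
    return skipped
```

Returning the skipped pairs leaves room to retry or report them later instead of losing track of them.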
[18:48:09] <henrik> I'm still doing some small tweaks, but if systemd kicks it off as a daemon, that's OK
[18:48:32] <glen> I've seen no failures to link/unlink on root or wwwrun - still running.
[18:48:38] <glen> Ugh that made me nervous though.
[18:48:49] <glen> I have it disabled in systemd
[18:48:54] <glen> So it will leave you alone until you tell me.
[18:49:06] <henrik> nono, go ahead and enable it
[18:49:29] <henrik> It looks as if we're ok now
[18:50:11] <henrik> I created a new patch copy: /usr/local/chare/ghostlinkd-0.40.2/
[18:50:26] <glen> Okay, same command to launch it?
[18:50:31] <henrik> yes
[18:50:35] <glen> Stand by
[18:51:24] <glen> Enabled and running
[18:51:38] <glen> Still no hardlink errors on my testers
[18:52:15] <glen> I am SO SORRY for the emergency call here.  :-(
[18:52:19] <henrik> nono
[18:52:39] <henrik> it seemed some code changes were needed, so don't worry about it
[18:52:48] <henrik> I'm doing $ tail -f /var/log/user.log  | grep --line-buffered ghostlink
[18:52:52] <glen> THANK YOU anyway!!!!! :-)  I am very grateful!
[18:52:54] <henrik> so I see what's going on
[18:52:58] <glen> I'm on tail -f /var/log/messages
[18:53:02] <glen> No grep at all  :-O
[18:53:08] <henrik> you're so absolutely welcome, Glen
[18:53:13] <henrik> heh :-)
[18:53:26] <glen> Okay, it appears we're okay.
[18:53:32] <glen> I'm halting my test processes.
[18:53:37] <henrik> ack
[18:54:03] <glen> I've reversed the backups so old ietfa and ietfb are now pulling from ietfc.
[18:54:09] <glen> I'm now going to enable replication of MySQL.
[18:54:13] <glen> Did you see Antony's email about that?
[18:54:18] <henrik> yes
[18:54:56] <henrik> it seems we'll need to upgrade, in order to be able to replicate
[18:55:44] <henrik> meanwhile, could you try the setting Antony asked about?
[18:55:57] <glen> Absolutely.  Will do that in just a minute.
[19:12:39] <glen> Henrik - If it's okay with you I'm going to get off IM - too many office people are pinging me.  May I please THANK YOU again for saving the day!  And I'll be working on replication with Antony next.
[19:13:39] glen leaves the room
[19:52:41] henrik leaves the room
[19:54:38] rjsparks@nostrum.com leaves the room
[22:47:23] RjS joins the room
[22:47:23] pck joins the room
[22:47:45] RjS-adium joins the room
[22:48:02] <RjS-adium> did you change stuff, or did it just start working?
[22:48:16] RjS leaves the room
[22:48:53] <pck> I didn't make a change