[01:07] ErKa (keryell@keryell.pck.nerim.net) left irc: Ping timeout: 480 seconds
[02:04] kugg (~kugg@user184.77-105-210.netatonce.net) left irc: Quit: goodbye have a nice day
[02:07] torkel (torkel@ip64.degernas.se) left irc: Ping timeout: 480 seconds
[03:24] Barbarossa (~max@rfc2324.org) left irc: Ping timeout: 480 seconds
[03:38] lazyb0y_ (~henning@v683.vanager.de) joined #fai.
[03:38] lazyb0y (~henning@v683.vanager.de) left irc: Read error: Connection reset by peer
[05:00] Lin (~igor@200.179.57.57) got netsplit.
[05:00] alexanderwz (~alexander@karuna.med.harvard.edu) got netsplit.
[05:01] Lin (~igor@200.179.57.57) returned to #fai.
[05:01] alexanderwz (~alexander@karuna.med.harvard.edu) returned to #fai.
[07:04] torkel (torkel@monsun.hpc2n.umu.se) joined #fai.
[08:01] torkel (torkel@monsun.hpc2n.umu.se) left irc: Quit: leaving
[08:01] torkel (torkel@ip64.degernas.se) joined #fai.
[08:04] MT (~MT@dove.informatik.tu-muenchen.de) joined #fai.
[08:12] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[08:20] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Quit: leaving
[08:49] ErKa (keryell@m2.wifi.enstb.org) joined #fai.
[08:55] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[09:05] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds
[09:08] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[09:33] Nick change: fai-guy -> fai-gay[slac]
[09:43] allee (~ach@allee.mpe.mpg.de) joined #fai.
[09:54] nocturn (~nocturn@d51A4378A.access.telenet.be) joined #fai.
[09:54] Hi guys
[09:54] My make-nfsroot is bailing out with an error:
[09:54] I: Base system installed successfully.
[09:54] Aborting
[09:54] No diversion `any diversion of /sbin/discover-modprobe', none removed
[09:55] I have this on two machines with Fai from Debian Etch
[10:28] nocturn, could you please run make-fai-nfsroot -v
[10:28] and paste the logs to paste.debian.net?
[10:30] Running now...
[10:39] Done, this is the 32-bit nfsroot
[10:39] http://paste.debian.net/44281
[10:40] seems like debootstrap is failing
[10:40] try the failing command yourself
[10:41] chroot /export/fai/nfsroot-i386 dpkg --force-depends --install var/cache/apt/archives/libc6_2.3.6.ds1-13etch2_i386.deb
[10:45] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds
[10:49] MT: it failed
[10:49] http://paste.debian.net/44284
[10:50] hmm, seems like the permissions of /dev/null are f** up
[10:50] Strange... what could have caused this?
[10:51] The nfsroot was empty before starting it
[10:51] did you play any tricks on udev?
[10:51] I guess udev should take care of such things ...
[10:51] no, it's a clean debian install...
[10:51] oh wait...
[10:52] I'm using libnss-ldap...
[10:52] Which causes udev problems on Ubuntu, so maybe Debian etch suffers from this too...
[10:52] hmm, I
[10:52] I'm using libnss-ldap as well, without any problems
[10:52] But which /dev/null is the problem, the one in the chroot or in the real fs.
[10:53] MT: is your FAI server also the LDAP server?
[10:53] no
[10:53] for me it is.
[10:54] just compare the output of ls -la /dev/null and ls -la /export/fai/nfsroot-i386/dev/null
[10:54] you are using FAI 3.1.8, aren't you
[10:54] permissions are identical
[10:55] Fai 3.1.8 (from Debian), yes
[10:56] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[10:57] that is, crw-rw-rw, root:root ?
[10:57] crw-rw-rw- 1 root root 1, 3
[10:58] and you are running all this as user root, aren't you?
[10:58] yes
[10:58] is the root user also managed by LDAP, or is it the usual passwd
[10:59] no, the root user is local
[10:59] ok, try this
[10:59] chroot /export/fai/nfsroot-i386 "echo x > /dev/null"
[11:00] it says: chroot: cannot run command `echo x > /dev/null': No such file or directory
[11:01] hmm, maybe the quotes aren't appropriate
[11:01] use this
[11:01] I also tried
[11:01] chroot /export...
[11:01] chroot /export/fai/nfsroot-i386
[11:01] echo x > /dev/null
[11:01] exit
[11:01] then in the chroot: echo x >
[11:01] bash: /dev/null: Permission denied
[11:01] ls -la /dev/null
[11:01] crw-rw-rw- 1 root root 1, 3 May 21 2007 /dev/null
[11:01] id
[11:01] ?
[11:02] uid=0(root) gid=0(root) groups=0(root)
[11:02] ls -lan /dev/null
[11:02] crw-rw-rw- 1 0 0 1, 3 May 21 2007 /dev/null
[11:04] exit
[11:04] mount
[11:04] pls ...
[11:04] could you paste that?
[11:05] http://paste.debian.net/44286
[11:05] Thanks for helping me MT
[11:05] found it?
[11:05] it's the nodev ...
[11:06] OK
[11:06] I totally forgot about that
[11:06] I guess nosuid is also undesirable on that one
[11:07] Trying again...
[11:09] It will run for a while, I'm popping out for lunch while it does.
[11:09] Action: nocturn crosses fingers
[11:34] MT (~MT@dove.informatik.tu-muenchen.de) left irc: Remote host closed the connection
[11:34] MT (~MT@dove.informatik.tu-muenchen.de) joined #fai.
[12:08] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds
[12:50] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[12:54] MT: It worked, the 32-bit was created correctly
[12:54] Now also trying 64-bit
[12:54] ok, cool!
[12:59] On 64-bit, it stays on I: Unpacking libgdbm3... for a very, very long time...
[13:02] I think it's dead...
[13:02] Is there any way I can check if it's still running?
[13:02] top?
[13:02] ps?
[13:03] top says 100% idle
[13:03] I think NFS is hanging..
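The "Permission denied" on /dev/null seen at 11:01, despite crw-rw-rw- and uid 0, was traced at 11:05 to the nodev mount option, presumably on the filesystem holding /export/fai/nfsroot-i386 (the mount paste itself is not in the log). A minimal sketch of that check, assuming mount-table text in the /proc/mounts layout; the function name and the sample device names are hypothetical, and escaped spaces in mount points are not handled:

```python
import os

# Hypothetical sample in /proc/mounts layout; not taken from the pasted output.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw,errors=remount-ro 0 0
/dev/sdb1 /export ext3 rw,nodev,nosuid 0 0
"""


def mount_options_for(path, mounts_text):
    """Return (mount_point, options) for the deepest mount point containing path."""
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # On equal length the later entry wins, since newer mounts shadow older ones.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


mount_point, options = mount_options_for(
    "/export/fai/nfsroot-i386/dev/null", SAMPLE_MOUNTS)
blocking = [opt for opt in options if opt in ("nodev", "nosuid")]
print(mount_point, blocking)
```

On a live system the text could be read from /proc/mounts instead of the sample; a non-empty list for the nfsroot path would point at the same cause discussed above.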
[13:04] Mrfai (~lange@kueppers.informatik.uni-koeln.de) joined #fai.
[13:04] Ynfs: server fermi2 not responding, still trying
[13:04] sure...
[13:26] Ok, copied it to a local dir and amd64 was built correctly
[13:26] Next step, trying to actually boot a node..
[13:50] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds
[14:03] PXE boot doesn't work
[14:04] it says: Trying to load pxelinux.cfg/C0A80145
[14:04] Then Trying to load pxelinux.cfg/C0A8014
[14:04] and so on.
[14:04] I checked on the server
[14:04] pxelinux.cfg/C0A80145 is there
[14:04] did you check the logs on the server?
[14:06] syslog?
[14:06] maybe daemon.log
[14:06] you might need to enable -vv
[14:06] to the tftpd flags
[14:07] trying...
[14:13] only this: in.tftpd[22260]: tftp: client does not accept options
[14:13] in inetd:
[14:13] tftp dgram udp wait root /usr/sbin/in.tftpd in.tftpd -vv -s /srv/tftp
[14:13] is that correct?
[14:14] seems so
[14:14] try -vvv
[14:14] and restart inetd, I think
[14:14] (assuming that you use the version from inetd at all)
[14:16] yes, I'm using inetd
[14:16] No change...
[14:16] only the line I posted above...
[14:17] nothing in syslog or daemon.log?
[14:17] are you using tftpd-hpa?
[14:18] yes
[14:18] It was pulled in as a dependency of Fai-quickstart
[14:18] ok, that's fine then
[14:19] nocturn: did you check tftp?
[14:19] oz_: what should I check?
[14:19] tftp is running, it gets a connection from the node
[14:19] but the node does not seem to find the FAI config file
[14:19] yes, but you could use a tftp client to check whether the files are there
[14:20] I think this is what oz_ suggested
[14:20] yes, exactly
[14:20] tftp localhost
[14:20] anyway, here I'm seeing further messages in /var/log/daemon.log
[14:20] the like on any other ftp client "get
[14:20] "
[14:21] Lin (~igor@200.179.57.57) left irc: Quit: Ex-Chat
[14:21] tftp localhost connects
[14:21] can you get the kernel, the config files?
[14:23] I can get the config file
[14:23] Though this action does not cause more logging than before
[14:24] Maybe my dhcpd.conf file is missing some option?
[14:24] subnet declaration has: server-name "fermi2"; filename "fai/pxelinux.0";
[14:24] }
[14:24] nocturn: the "client does not accept options" makes me think that it could be that you have two tftpds installed...
[14:25] and that you are using the wrong one right now
[14:25] oz_: my inetd has this line
[14:25] what does "dpkg -l | grep tftp" say?
[14:25] tftp dgram udp wait root /usr/sbin/in.tftpd in.tftpd -vvv -s /srv/tftp
[14:25] ii tftp-hpa 0.43-1.1 HPA's tftp client
[14:25] ii tftpd-hpa 0.43-1.1 HPA's tftp server
[14:25] nothing more
[14:25] ?
[14:25] nope
[14:26] oz_: does the inetd line look correct?
[14:26] nocturn: please wait, I'll check it in a minute
[14:27] what's your output of "dpkg -S /usr/sbin/in.tftpd"?
[14:27] nocturn: looks quite okay
[14:28] but I'd run it as a daemon, just for testing
[14:31] tftpd-hpa: /usr/sbin/in.tftpd
[14:32] oz_: just run /usr/sbin/in.tftpd -vvv -s /srv/tftp
[14:33] hm. weird looks good.
[14:33] still not working?
[14:33] any other error messages (e.g. in daemon.log)
[14:33] ?
[14:33] I started the daemon standalone (killed inetd)
[14:33] waiting for the node to boot
[14:33] takes a while (Dell firmware...)
[14:34] do you haxewhat does " dpkg -l | grep pxe" tell you?
[14:35] s/do\ you\ haxe// , sorry.
[14:36] dpkg -l | grep pxe
[14:36] nothing
[14:36] The only message in daemon.log or syslog was about the client not accepting options
[14:37] then install pxe
[14:37] apt-get install pxe
[14:37] done
[14:37] that _should_ help.
[14:37] Do I need to put it anywhere (like inetd.conf)?
[14:37] ps aux | grep pxe shows a process?
[14:38] nope
[14:38] just install
[14:38] 110 22440 0.0 0.0 3256 1004 ? S 15:37 0:00 /usr/sbin/pxe
[14:38] Ok, rebooting the node
[14:38] Nick change: MT -> Guest440
[14:38] Guest440 (~MT@dove.informatik.tu-muenchen.de) left irc: Read error: Connection reset by peer
[14:38] MT (~MT@dove.informatik.tu-muenchen.de) joined #fai.
[14:42] no change....
[14:43] /var/log/pxe.log does exist?
[14:43] tftp must be working, the node reports tftp prefix: fai
[14:43] oz_: yes
[14:43] what does it say?
[14:43] contains:
[14:43] Thu Dec 6 15:37:25 2007: Info: Sock::Open: Bound to address: 192.168.1.1 Port: 4011
[14:43] Thu Dec 6 15:37:25 2007: Info: Sock::JoinMulticast: Joined multicast group
[14:43] nothing more
[14:44] Mon Dec 3 16:33:51 2007: Info: Sock::Open: Bound to address: 192.168.1.50 Port: 4011 <- nothing like that?
[14:44] maybe...
[14:44] no
[14:44] don't suspect me for the next sentence I say.
[14:44] maybe you reboot the server. ;)
[14:44] LOL
[14:45] Tried that a couple of times already
[14:45] Doesn't the error indicate that it can't find the file over tftp?
[14:45] after pxe installation?
[14:45] It reports PXELINUX when booting
[14:45] ah, rebooting after pxe? No...
[14:45] Will try that.
[14:46] :->
[14:46] just to be sure that everything is started anew
[14:46] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[14:46] I am not really sure what this pxe thing does, never traced it.
[14:47] I really don't understand it.... It should work
[14:47] it sees tftp, has the right prefix, so why does it not find the config files...
[14:47] yes, as far as I can see now, it should.
[14:48] nocturn: can you get the config files by tftp?
[14:48] what about the permissions?
[14:48] and...where did you set this prefix?
[14:48] Action: oz_ never set any prefix
[14:48] at last, you might want to try tcpdump
[14:48] after all, it's nice plain-text thing ...
[14:48] oz_: yes, I could get them manually
[14:49] master is rebooted, booting node
[14:49] I never used the pxe package.
[14:49] I never set the prefix, but I guess it gets it from the dhcpd.conf
[14:49] filename "fai/pxelinux.0";
[14:49] guessing is bad with computers...
[14:49] one should know...that's why I hate this "reboot" stuff.
[14:50] LOL, I don't know where it came from
[14:50] Mrfai: never?
[14:51] no change whatsoever...
[14:51] so, your machine still hangs in a loop while trying to get the kernel, right?
[14:52] not the kernel, the FAI generated config file
[14:53] oz_: yep. Never
[14:53] just the config file
[14:53] yes
[14:53] Action: oz_ rereads the log
[14:54] It says trying to load : pxelinux.cfg/C0A80145
[14:54] and hangs...
[14:54] then trying to load : pxelinux.cfg/C0A8014
[14:54] and so on..
[14:54] and you get this with "get pxelinux.cfg/C0A8014" on the tftp commandline?
[14:54] yet in the tftp root, pxelinux.cfg/C0A80145 exists
[14:55] oz_: I don't think so, since it misses the fai prefix
[14:55] nocturn: if your tftpd is started via inetd, you have to add -s to define the root directory of your tftpd
[14:55] also check if your client really asks your server and not any other server
[14:55] Mrfai: it is -s /srv/tftp
[14:56] Mrfai: it's this server, they are on an isolated network
[14:56] I have -s /srv/tftp/fai. Using FAI 3.2.x
[14:56] in /srv/tftp, there's a subdir fai
[14:56] trying that...
[14:57] nocturn: what does ls -l /srv/tftp/fai/pxelinux.cfg report for you? Did you call fai-chboot?
[14:58] -rw-r--r-- 1 root root 236 2007-12-06 14:43 C0A80145
[14:58] -rw-r--r-- 1 root root 236 2007-12-04 15:34 C0A80146
[14:58] ok. good
[14:58] I did fai-chboot -I
[14:58] MT (~MT@dove.informatik.tu-muenchen.de) left irc: Ping timeout: 480 seconds
[14:58] with the nodename
[14:59] I'd guess...
[14:59] prefix confusion
[15:01] I guess so too
[15:01] I also wonder about the "filename "fai/pxelinux.0";" thing
[15:02] yeah, that might be the problem...
[15:03] I think your client already got pxelinux.0, otherwise you would not have seen that it tries to download C0A80145.
[15:03] Mrfai: Yes, that's right.
[15:03] nocturn: where is your pxelinux.0?
[15:03] I have this in dhcpd.conf: filename "pxelinux.0";
[15:03] oz_: in /srv/tftp/fai/
[15:03] I'm sure your -s path for tftpd is wrong
[15:03] Mrfai: I tried it with that using -s /srv/tftp/fai
[15:04] but then it fails to find pxelinux.0
[15:04] with filename "pxelinux.0";
[15:04] crg (crg@lagoon.freebsd.lublin.pl) left irc: Ping timeout: 480 seconds
[15:04] change filename in dhcpd.conf, restart dhcpd
[15:05] Mrfai: what is your entire line in inetd.conf for tftp?
[15:06] tftp dgram udp wait root /usr/sbin/in.tftpd /usr/sbin/in.tftpd -s /srv/tftp/fai
[15:06] same here
[15:06] I'm going to try one more boot, but then I'll have to leave for home...
[15:06] did you restart dhcpd
[15:06] ?
[15:06] Thanks very much guys
[15:06] yes
[15:09] nocturn (~nocturn@d51A4378A.access.telenet.be) left irc: Quit: leaving
[15:25] Bokeh (~blaat@berchem.lorentz.leidenuniv.nl) left irc: Quit: Leaving
[15:36] fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Quit: leaving
[15:37] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai.
[15:42] Mrfai, http://www.informatik.uni-koeln.de/fai/ says 3.2.3 released....
[15:56] MT (~MT@dove.informatik.tu-muenchen.de) joined #fai.
[15:57] h01ger: changed. Thanks.
[15:58] :)
[15:59] fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Quit: leaving
[16:36] ErKa (keryell@m2.wifi.enstb.org) left irc: Ping timeout: 480 seconds
[17:02] Mrfai (~lange@kueppers.informatik.uni-koeln.de) left irc: Quit: leaving
[17:10] MT (~MT@dove.informatik.tu-muenchen.de) left irc: Ping timeout: 480 seconds
[17:48] Lin (~igor@200.179.57.57) joined #fai.
[18:11] MT (~MT@ppp-82-135-90-63.dynamic.mnet-online.de) joined #fai.
[18:19] Lin (~igor@200.179.57.57) left irc: Quit: Ex-Chat
[18:20] MT (~MT@ppp-82-135-90-63.dynamic.mnet-online.de) left irc: Ping timeout: 480 seconds
[18:24] MT (~MT@dove.informatik.tu-muenchen.de) joined #fai.
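The names the node cycles through at 14:04 and 14:54 (C0A80145, then C0A8014, and so on) follow PXELINUX's lookup by client IP written in uppercase hex, dropping one digit per retry. A minimal sketch of that lookup, assuming the node's address is 192.168.1.69 (which is what C0A80145 decodes to) and that requests are made relative to the directory of the DHCP filename, as the "tftp prefix: fai" message suggests. The function names are made up for illustration, and any names PXELINUX may try before the IP-based ones are not modeled:

```python
import ipaddress
import posixpath


def pxelinux_ip_candidates(ip):
    """Config names derived from a client IP: full hex, then one digit shorter each time."""
    hex_ip = "%08X" % int(ipaddress.IPv4Address(ip))
    return [hex_ip[:n] for n in range(8, 0, -1)] + ["default"]


def tftp_request_paths(boot_filename, ip):
    """Paths requested relative to the TFTP root, given the DHCP 'filename' value."""
    prefix = posixpath.dirname(boot_filename)  # "fai" for "fai/pxelinux.0", "" for "pxelinux.0"
    return [posixpath.join(prefix, "pxelinux.cfg", name)
            for name in pxelinux_ip_candidates(ip)]


# First few requests for the setup discussed in the log.
for path in tftp_request_paths("fai/pxelinux.0", "192.168.1.69")[:3]:
    print(path)
```

Read this way, the tftpd root given with -s and the dhcpd.conf filename have to agree: with -s /srv/tftp the filename "fai/pxelinux.0" leads to requests under fai/pxelinux.cfg/, while with -s /srv/tftp/fai the filename would need to be plain "pxelinux.0" for the same files to be found.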
[20:38] ErKa (keryell@keryell.pck.nerim.net) joined #fai.
[22:23] kugg (~kugg@user184.77-105-210.netatonce.net) joined #fai.
[22:38] allee (~ach@allee.mpe.mpg.de) left irc: Remote host closed the connection
[22:53] alexanderwz (~alexander@karuna.med.harvard.edu) got netsplit.
[22:53] lazyb0y_ (~henning@v683.vanager.de) got netsplit.
[22:53] blblack (~brandon@wasabi.dtmf.com) got netsplit.
[22:53] juri_ (NkDrE20Hwt@volumehost.com) got netsplit.
[22:53] kriebly (~moho@wisdom.Stanford.EDU) got netsplit.
[22:53] blblack (~brandon@wasabi.dtmf.com) returned to #fai.
[22:54] kriebly (~moho@wisdom.Stanford.EDU) returned to #fai.
[22:54] lazyb0y_ (~henning@v683.vanager.de) returned to #fai.
[22:54] juri_ (NkDrE20Hwt@volumehost.com) returned to #fai.
[22:55] alexanderwz (~alexander@karuna.med.harvard.edu) returned to #fai.
[23:10] MT (~MT@dove.informatik.tu-muenchen.de) left irc: Ping timeout: 480 seconds
[00:00] --- Fri Dec 7 2007