| ErKa (keryell@keryell.pck.nerim.net) left irc: Ping timeout: 480 seconds |
| kugg (~kugg@user184.77-105-210.netatonce.net) left irc: Quit: goodbye have a nice day |
| torkel (torkel@ip64.degernas.se) left irc: Ping timeout: 480 seconds |
| Barbarossa (~max@rfc2324.org) left irc: Ping timeout: 480 seconds |
| lazyb0y_ (~henning@v683.vanager.de) joined #fai. |
| lazyb0y (~henning@v683.vanager.de) left irc: Read error: Connection reset by peer |
| Lin (~igor@200.179.57.57) got netsplit. |
| alexanderwz (~alexander@karuna.med.harvard.edu) got netsplit. |
| Lin (~igor@200.179.57.57) returned to #fai. |
| alexanderwz (~alexander@karuna.med.harvard.edu) returned to #fai. |
| torkel (torkel@monsun.hpc2n.umu.se) joined #fai. |
| torkel (torkel@monsun.hpc2n.umu.se) left irc: Quit: leaving |
| torkel (torkel@ip64.degernas.se) joined #fai. |
| MT (~MT@dove.informatik.tu-muenchen.de) joined #fai. |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Quit: leaving |
| ErKa (keryell@m2.wifi.enstb.org) joined #fai. |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| Nick change: fai-guy -> fai-gay[slac] |
| allee (~ach@allee.mpe.mpg.de) joined #fai. |
| nocturn (~nocturn@d51A4378A.access.telenet.be) joined #fai. |
| 09:54 nocturn | Hi guys |
| 09:54 nocturn | My make-nfsroot is bailing out with an error: |
| 09:54 nocturn | I: Base system installed successfully. |
| 09:54 nocturn | Aborting |
| 09:54 nocturn | No diversion `any diversion of /sbin/discover-modprobe', none removed |
| 09:55 nocturn | I have this on two machines with Fai from Debian Etch |
| 10:28 MT | nocturn, could you please run make-fai-nfsroot -v |
| 10:28 MT | and paste the logs to paste.debian.net? |
| 10:30 nocturn | Running now... |
| 10:39 nocturn | Done, this is the 32-bit nfsroot |
| 10:39 nocturn | http://paste.debian.net/44281 |
| 10:40 MT | seems like debootstrap is failing |
| 10:40 MT | try the failing command yourself |
| 10:41 MT | chroot /export/fai/nfsroot-i386 dpkg --force-depends --install var/cache/apt/archives/libc6_2.3.6.ds1-13etch2_i386.deb |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds |
| 10:49 nocturn | MT: it failed |
| 10:49 nocturn | http://paste.debian.net/44284 |
| 10:50 MT | hmm, seems like the permissions of /dev/null are f** up |
| 10:50 nocturn | Strange... what could have caused this? |
| 10:51 nocturn | The nfsroot was empty before starting it |
| 10:51 MT | did you play any tricks on udev? |
| 10:51 MT | I guess udev should take care of such things ... |
| 10:51 nocturn | no, it's a clean Debian install... |
| 10:51 nocturn | oh wait... |
| 10:52 nocturn | I'm using libnss-ldap... |
| 10:52 nocturn | Which causes udev problems on Ubuntu, so maybe Debian etch suffers from this too... |
| 10:52 MT | hmm, I |
| 10:52 MT | I'm using libnss-ldap as well, without any problems |
| 10:52 nocturn | But which /dev/null is the problem, the one in the chroot or in the real fs. |
| 10:53 nocturn | MT: is your FAI server also the LDAP server? |
| 10:53 MT | no |
| 10:53 nocturn | for me it is. |
| 10:54 MT | just compare the output of ls -la /dev/null and ls -la /export/fai/nfsroot-i386/dev/null |
| 10:54 MT | you are using FAI 3.1.8, aren't you |
| 10:54 nocturn | permissions are identical |
| 10:55 nocturn | Fai 3.1.8 (from Debian), yes |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| 10:57 MT | that is, crw-rw-rw, root:root ? |
| 10:57 nocturn | crw-rw-rw- 1 root root 1, 3 |
| 10:58 MT | and you are running all this as user root, aren't you? |
| 10:58 nocturn | yes |
| 10:58 MT | is the root user also managed by LDAP, or is it the usual passwd |
| 10:59 nocturn | no, the root user is local |
| 10:59 MT | ok, try this |
| 10:59 MT | chroot /export/fai/nfsroot-i386 "echo x > /dev/null" |
| 11:00 nocturn | it says: chroot: cannot run command `echo x > /dev/null': No such file or directory |
| 11:01 MT | hmm, maybe the quotes aren't appropriate |
| 11:01 MT | use this |
| 11:01 nocturn | I also tried |
| 11:01 nocturn | chroot /export... |
| 11:01 MT | chroot /export/fai/nfsroot-i386 |
| 11:01 MT | echo x > /dev/null |
| 11:01 MT | exit |
| 11:01 nocturn | then in the chroot: echo x > |
| 11:01 nocturn | bash: /dev/null: Permission denied |
| 11:01 nocturn | ls -la /dev/null |
| 11:01 nocturn | crw-rw-rw- 1 root root 1, 3 May 21 2007 /dev/null |
| 11:01 MT | id |
| 11:01 MT | ? |
| 11:02 nocturn | uid=0(root) gid=0(root) groups=0(root) |
| 11:02 MT | ls -lan /dev/null |
| 11:02 nocturn | crw-rw-rw- 1 0 0 1, 3 May 21 2007 /dev/null |
| 11:04 MT | exit |
| 11:04 MT | mount |
| 11:04 MT | pls ... |
| 11:04 MT | could you paste that? |
| 11:05 nocturn | http://paste.debian.net/44286 |
| 11:05 nocturn | Thanks for helping me MT |
| 11:05 MT | found it? |
| 11:05 MT | it's the nodev ... |
| 11:06 nocturn | OK |
| 11:06 nocturn | I totally forgot about that |
| 11:06 MT | I guess nosuid is also undesirable on that one |
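Note on the `nodev` diagnosis: a device node such as /dev/null on a filesystem mounted with `nodev` keeps its usual permissions but cannot be opened as a device, which matches the "Permission denied" seen inside the chroot. The following is a minimal sketch for checking this, assuming a current system where `findmnt` from util-linux is available. It was not used in the session, and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers, not part of FAI.

# Return 0 if the comma-separated mount option list in $1 contains option $2.
has_mount_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print the mount options of the filesystem that contains path $1.
# Assumes findmnt (util-linux) is installed.
mount_options_for() {
    findmnt -n -o OPTIONS -T "$1"
}

# Example (path taken from the log above):
#   if has_mount_option "$(mount_options_for /export/fai/nfsroot-i386)" nodev; then
#       echo "nodev is set: device nodes in the nfsroot will not work"
#   fi
```

The same check with `nosuid` as the second argument covers MT's other remark.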
| 11:07 nocturn | Trying again... |
| 11:09 nocturn | It will run for a while, I'm popping out for lunch while it does. |
| Action: nocturn crosses fingers |
| MT (~MT@dove.informatik.tu-muenchen.de) left irc: Remote host closed the connection |
| MT (~MT@dove.informatik.tu-muenchen.de) joined #fai. |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| 12:54 nocturn | MT: It worked, the 32-bit was created correctly |
| 12:54 nocturn | Now also trying 64-bit |
| 12:54 MT | ok, cool! |
| 12:59 nocturn | On 64-bit, it stays on I: Unpacking libgdbm3... for a very, very long time... |
| 13:02 nocturn | I think it's dead... |
| 13:02 nocturn | Is there any way I can check if it's still running? |
| 13:02 MT | top? |
| 13:02 MT | ps? |
| 13:03 nocturn | top says 100% idle |
| 13:03 nocturn | I think NFS is hanging.. |
| Mrfai (~lange@kueppers.informatik.uni-koeln.de) joined #fai. |
| 13:04 nocturn | nfs: server fermi2 not responding, still trying |
| 13:04 nocturn | sure... |
| 13:26 nocturn | Ok, copied it to a local dir and amd64 was built correctly |
| 13:26 nocturn | Next step, trying to actually boot a node.. |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Ping timeout: 480 seconds |
| 14:03 nocturn | PXE boot doesn't work |
| 14:04 nocturn | it says: Trying to load pxelinux.cfg/C0A80145 |
| 14:04 nocturn | Then Trying to load pxelinux.cfg/C0A8014 |
| 14:04 nocturn | and so on. |
| 14:04 nocturn | I checked on the server |
| 14:04 nocturn | pxelinux.cfg/C0A80145 is there |
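Note on the file names: C0A80145 is the client's IPv4 address in uppercase hex (here 192.168.1.69), and the shrinking names in the boot messages are PXELINUX dropping one hex digit per attempt before falling back to `default`. A small sketch of that search order follows. The function names are hypothetical, and real PXELINUX versions may also try UUID- or MAC-based names first, which this omits.

```shell
#!/bin/bash
# Hypothetical helpers that mimic the name sequence seen in the boot messages.

# Convert a dotted-quad IPv4 address to the 8-digit uppercase hex form.
ip_to_pxe_hex() {
    local IFS=.
    # Split the address into its four octets.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

# Print the config file names in the order they are tried:
# full hex address, then one digit shorter each time, then "default".
pxe_config_candidates() {
    local hex
    hex=$(ip_to_pxe_hex "$1")
    while [ -n "$hex" ]; do
        printf 'pxelinux.cfg/%s\n' "$hex"
        hex=${hex%?}
    done
    printf 'pxelinux.cfg/default\n'
}

pxe_config_candidates 192.168.1.69
```

These names are requested relative to the directory the boot loader itself was fetched from, so a file can exist on the server and still not be found if that prefix and the tftpd root do not line up.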
| 14:04 MT | did you check the logs on the server? |
| 14:06 nocturn | syslog? |
| 14:06 MT | maybe daemon.log |
| 14:06 MT | you might need to enable -vv |
| 14:06 MT | to the tftpd flags |
| 14:07 nocturn | trying... |
| 14:13 nocturn | only this: in.tftpd[22260]: tftp: client does not accept options |
| 14:13 nocturn | in inetd: |
| 14:13 nocturn | tftp dgram udp wait root /usr/sbin/in.tftpd in.tftpd -vv -s /srv/tftp |
| 14:13 nocturn | is that correct? |
| 14:14 MT | seems so |
| 14:14 MT | try -vvv |
| 14:14 MT | and restart inetd, I think |
| 14:14 MT | (assuming that you use the version from inetd at all) |
| 14:16 nocturn | yes, I'm using inetd |
| 14:16 nocturn | No change... |
| 14:16 nocturn | only the line I posted above... |
| 14:17 MT | nothing in syslog or daemon.log? |
| 14:17 MT | are you using tftpd-hpa? |
| 14:18 nocturn | yes |
| 14:18 nocturn | It was pulled in as a dependency of fai-quickstart |
| 14:18 MT | ok, that's fine then |
| 14:19 oz_ | nocturn: did you check tftp? |
| 14:19 nocturn | oz_: what should I check? |
| 14:19 nocturn | tftp is running, it gets a connection from the node |
| 14:19 nocturn | but the node does not seem to find the FAI config file |
| 14:19 MT | yes, but you could use a tftp client to check whether the files are there |
| 14:20 MT | I think this is what oz_ suggested |
| 14:20 oz_ | yes, exactly |
| 14:20 oz_ | tftp localhost |
| 14:20 MT | anyway, here I'm seeing further messages in /var/log/daemon.log |
| 14:20 oz_ | then, like on any other ftp client, "get <filename>" |
| Lin (~igor@200.179.57.57) left irc: Quit: Ex-Chat |
| 14:21 nocturn | tftp localhost connects |
| 14:21 oz_ | can you get the kernel, the config files? |
| 14:23 nocturn | I can get the config file |
| 14:23 nocturn | Though this action does not cause more logging than before |
| 14:24 nocturn | Maybe my dhcpd.conf file is missing some option? |
| 14:24 nocturn | subnet declaration has: server-name "fermi2"; filename "fai/pxelinux.0"; |
| 14:24 nocturn | } |
| 14:24 oz_ | nocturn: the "client does not accept options" makes me think that it could be that you have two tftpds installed... |
| 14:25 oz_ | and that you are using the wrong one right now |
| 14:25 nocturn | oz_: my inetd has this line |
| 14:25 oz_ | what does "dpkg -l | grep tftp" say? |
| 14:25 nocturn | tftp dgram udp wait root /usr/sbin/in.tftpd in.tftpd -vvv -s /srv/tftp |
| 14:25 nocturn | ii tftp-hpa 0.43-1.1 HPA's tftp client |
| 14:25 nocturn | ii tftpd-hpa 0.43-1.1 HPA's tftp server |
| 14:25 oz_ | nothing more |
| 14:25 oz_ | ? |
| 14:25 nocturn | nope |
| 14:26 nocturn | oz_: does the inetd line look correct? |
| 14:26 oz_ | nocturn: please wait, I'll check it in a minute |
| 14:27 oz_ | what's your output of "dpkg -S /usr/sbin/in.tftpd"? |
| 14:27 oz_ | nocturn: looks quite okay |
| 14:28 oz_ | but I'd run it as a daemon, just for testing |
| 14:31 nocturn | tftpd-hpa: /usr/sbin/in.tftpd |
| 14:32 nocturn | oz_: just run /usr/sbin/in.tftpd -vvv -s /srv/tftp |
| 14:33 oz_ | hm. weird, looks good. |
| 14:33 oz_ | still not working? |
| 14:33 oz_ | any other error messages (e.g. in daemon.log) |
| 14:33 oz_ | ? |
| 14:33 nocturn | I started the daemon standalone (killed inetd) |
| 14:33 nocturn | waiting for the node to boot |
| 14:33 nocturn | takes a while (Dell firmware...) |
| 14:34 oz_ | do you haxewhat does " dpkg -l | grep pxe" tell you? |
| 14:35 oz_ | s/do\ you\ haxe// , sorry. |
| 14:36 nocturn | dpkg -l | grep pxe |
| 14:36 nocturn | nothing |
| 14:36 nocturn | The only message in daemon.log or syslog was about the client not accepting options |
| 14:37 oz_ | then install pxe |
| 14:37 oz_ | apt-get install pxe |
| 14:37 nocturn | done |
| 14:37 oz_ | that _should_ help. |
| 14:37 nocturn | Do I need to put it anywhere (like inetd.conf)? |
| 14:37 oz_ | ps aux | grep pxe shows a process? |
| 14:38 oz_ | nope |
| 14:38 oz_ | just install |
| 14:38 nocturn | 110 22440 0.0 0.0 3256 1004 ? S 15:37 0:00 /usr/sbin/pxe |
| 14:38 nocturn | Ok, rebooting the node |
| Nick change: MT -> Guest440 |
| Guest440 (~MT@dove.informatik.tu-muenchen.de) left irc: Read error: Connection reset by peer |
| MT (~MT@dove.informatik.tu-muenchen.de) joined #fai. |
| 14:42 nocturn | no change.... |
| 14:43 oz_ | /var/log/pxe.log does exist? |
| 14:43 nocturn | tftp must be working, the node reports tftp prefix: fai |
| 14:43 nocturn | oz_: yes |
| 14:43 oz_ | what does it say? |
| 14:43 nocturn | contains: |
| 14:43 nocturn | Thu Dec 6 15:37:25 2007: Info: Sock::Open: Bound to address: 192.168.1.1 Port: 4011 |
| 14:43 nocturn | Thu Dec 6 15:37:25 2007: Info: Sock::JoinMulticast: Joined multicast group |
| 14:43 nocturn | nothing more |
| 14:44 oz_ | Mon Dec 3 16:33:51 2007: Info: Sock::Open: Bound to address: 192.168.1.50 Port: 4011 <- nothing like that? |
| 14:44 oz_ | maybe... |
| 14:44 nocturn | no |
| 14:44 oz_ | don't suspect me for the next sentence I say. |
| 14:44 oz_ | maybe you reboot the server. ;) |
| 14:44 nocturn | LOL |
| 14:45 nocturn | Tried that a couple of times already |
| 14:45 nocturn | Doesn't the error indicate that it can't find the file over tftp? |
| 14:45 oz_ | after pxe installation? |
| 14:45 nocturn | It reports PXELINUX when booting |
| 14:45 nocturn | ah, rebooting after pxe? No... |
| 14:45 nocturn | Will try that. |
| 14:46 oz_ | :-> |
| 14:46 oz_ | just to be sure that everything is started anew |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| 14:46 oz_ | I am not really sure what this pxething does, never traced it. |
| 14:47 nocturn | I really don't understand it.... It should work |
| 14:47 nocturn | it sees tftp, has the right prefix, so why does it not find the config files... |
| 14:47 oz_ | yes, as far as I can see now, it should. |
| 14:48 oz_ | nocturn: can you get the config files by tftp? |
| 14:48 MT | what about the permissions? |
| 14:48 oz_ | and...where did you set this prefix? |
| Action: oz_ never set any prefix |
| 14:48 MT | at last, you might want to try tcpdump |
| 14:48 MT | after all, it's a nice plain-text thing ... |
| 14:48 nocturn | oz_: yes, I could get them manually |
| 14:49 nocturn | master is rebooted, booting node |
| 14:49 Mrfai | I never used the pxe package. |
| 14:49 nocturn | I never set the prefix, but I guess it gets it from the dhcpd.conf |
| 14:49 nocturn | filename "fai/pxelinux.0"; |
| 14:49 oz_ | guessing is bad with computers... |
| 14:49 oz_ | one should know...that's why I hate this "reboot" stuff. |
| 14:50 nocturn | LOL, I don't know where it came from |
| 14:50 oz_ | Mrfai: never? |
| 14:51 nocturn | no change whatsoever... |
| 14:51 oz_ | so, your machine still hangs in a loop while trying to get the kernel, right? |
| 14:52 nocturn | not the kernel, the FAI generated config file |
| 14:53 Mrfai | oz_: yep. Never |
| 14:53 oz_ | just the config file |
| 14:53 nocturn | yes |
| Action: oz_ rereads the log |
| 14:54 nocturn | It says trying to load : pxelinux.cfg/C0A80145 |
| 14:54 oz_ | and hangs... |
| 14:54 nocturn | then trying to load : pxelinux.cfg/C0A8014 |
| 14:54 nocturn | and so on.. |
| 14:54 oz_ | and you get this with "get pxelinux.cfg/C0A8014" on the tftp commandline? |
| 14:54 nocturn | yet in the tftp root, pxelinux.cfg/C0A80145 exists |
| 14:55 nocturn | oz_: I don't think so, since it misses the fai prefix |
| 14:55 Mrfai | nocturn: if your tftpd is started via inetd, you have to add -s to define the root directory of your tftpd |
| 14:55 Mrfai | also check that your client really asks your server and not some other server |
| 14:55 nocturn | Mrfai: it is -s /srv/tftp |
| 14:56 nocturn | Mrfai: it's this server, they are on an isolated network |
| 14:56 Mrfai | I have -s /srv/tftp/fai. Using FAI 3.2.x |
| 14:56 nocturn | in /srv/tftp, there's a subdir fai |
| 14:56 nocturn | trying that... |
| 14:57 Mrfai | nocturn: what does ls -l /srv/tftp/fai/pxelinux.cfg report for you? Did you call fai-chboot? |
| 14:58 nocturn | -rw-r--r-- 1 root root 236 2007-12-06 14:43 C0A80145 |
| 14:58 nocturn | -rw-r--r-- 1 root root 236 2007-12-04 15:34 C0A80146 |
| 14:58 Mrfai | ok. good |
| 14:58 nocturn | I did fai-chboot -I |
| MT (~MT@dove.informatik.tu-muenchen.de) left irc: Ping timeout: 480 seconds |
| 14:58 nocturn | with the nodename |
| 14:59 oz_ | I'd guess... |
| 14:59 oz_ | prefix confusion |
| 15:01 nocturn | I guess so too |
| 15:01 oz_ | i also wonder about the "filename "fai/pxelinux.0";" thing |
| 15:02 nocturn | yeah, that might be the problem... |
| 15:03 Mrfai | I think your client already got pxelinux.0, otherwise you would not have seen that it tries to download C0A80145. |
| 15:03 nocturn | Mrfai: Yes, that's right. |
| 15:03 oz_ | nocturn: where is your pxelinux.0? |
| 15:03 Mrfai | I have this in dhcpd.conf: filename "pxelinux.0"; |
| 15:03 nocturn | oz_: in /srv/tftp/fai/ |
| 15:03 Mrfai | I'm sure your -s path for tftpd is wrong |
| 15:03 nocturn | Mrfai: I tried it with that using -s /srv/tftp/fai |
| 15:04 nocturn | but then it fails to find pxelinux.0 |
| 15:04 nocturn | with filename "pxelinux.0"; |
| crg (crg@lagoon.freebsd.lublin.pl) left irc: Ping timeout: 480 seconds |
| 15:04 Mrfai | change filename in dhcpd.conf, restart dhcpd |
| 15:05 nocturn | Mrfai: what is your entire line in inetd.conf for tftp? |
| 15:06 Mrfai | tftp dgram udp wait root /usr/sbin/in.tftpd /usr/sbin/in.tftpd -s /srv/tftp/fai |
| 15:06 nocturn | same here |
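Note on the two setups being compared: the `-s` directory of in.tftpd and the `filename` in dhcpd.conf have to agree, because PXELINUX looks for `pxelinux.cfg/` under the directory part of the boot file name (the "tftp prefix: fai" reported by the node). The sketch below only works out where the config directory would have to sit on the server for a given pair. The function name is hypothetical, and it assumes the prefix is derived from the boot file name alone.

```shell
#!/bin/sh
# Hypothetical helper: print the server-side directory in which the
# pxelinux.cfg files must exist, given
#   $1 = directory passed to in.tftpd with -s
#   $2 = filename option from dhcpd.conf
pxe_config_dir() {
    tftp_root=$1
    boot_file=$2
    case "$boot_file" in
        # Boot file lives in a subdirectory: that subdirectory is the prefix.
        */*) prefix=${boot_file%/*}/ ;;
        # Boot file sits directly in the tftpd root: no prefix.
        *)   prefix= ;;
    esac
    printf '%s/%spxelinux.cfg\n' "$tftp_root" "$prefix"
}

pxe_config_dir /srv/tftp     fai/pxelinux.0   # nocturn's original setup
pxe_config_dir /srv/tftp/fai pxelinux.0       # Mrfai's setup
pxe_config_dir /srv/tftp/fai fai/pxelinux.0   # mixed settings
```

The first two combinations resolve to the same directory, /srv/tftp/fai/pxelinux.cfg. Only the mixed one points at a path with a doubled `fai/`, so mixing the two styles is one thing worth ruling out.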
| 15:06 nocturn | I'm going to try one more boot, but then I'll have to leave for home... |
| 15:06 Mrfai | did you restart dhcpd |
| 15:06 Mrfai | ? |
| 15:06 nocturn | Thanks very much guys |
| 15:06 nocturn | yes |
| nocturn (~nocturn@d51A4378A.access.telenet.be) left irc: Quit: leaving |
| Bokeh (~blaat@berchem.lorentz.leidenuniv.nl) left irc: Quit: Leaving |
| fai-gay[slac] (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Quit: leaving |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) joined #fai. |
| 15:42 h01ger | Mrfai, http://www.informatik.uni-koeln.de/fai/ says 3.2.3 released.... |
| MT (~MT@dove.informatik.tu-muenchen.de) joined #fai. |
| 15:57 Mrfai | h01ger: changed. Thanks. |
| 15:58 h01ger | :) |
| fai-guy (~fai-guy@213-23-4-247.accor-hotels.arcor-ip.net) left irc: Quit: leaving |
| ErKa (keryell@m2.wifi.enstb.org) left irc: Ping timeout: 480 seconds |
| Mrfai (~lange@kueppers.informatik.uni-koeln.de) left irc: Quit: leaving |
| MT (~MT@dove.informatik.tu-muenchen.de) left irc: Ping timeout: 480 seconds |
| Lin (~igor@200.179.57.57) joined #fai. |
| MT (~MT@ppp-82-135-90-63.dynamic.mnet-online.de) joined #fai. |
| Lin (~igor@200.179.57.57) left irc: Quit: Ex-Chat |
| MT (~MT@ppp-82-135-90-63.dynamic.mnet-online.de) left irc: Ping timeout: 480 seconds |
| MT (~MT@dove.informatik.tu-muenchen.de) joined #fai. |
| ErKa (keryell@keryell.pck.nerim.net) joined #fai. |
| kugg (~kugg@user184.77-105-210.netatonce.net) joined #fai. |
| allee (~ach@allee.mpe.mpg.de) left irc: Remote host closed the connection |
| alexanderwz (~alexander@karuna.med.harvard.edu) got netsplit. |
| lazyb0y_ (~henning@v683.vanager.de) got netsplit. |
| blblack (~brandon@wasabi.dtmf.com) got netsplit. |
| juri_ (NkDrE20Hwt@volumehost.com) got netsplit. |
| kriebly (~moho@wisdom.Stanford.EDU) got netsplit. |
| blblack (~brandon@wasabi.dtmf.com) returned to #fai. |
| kriebly (~moho@wisdom.Stanford.EDU) returned to #fai. |
| lazyb0y_ (~henning@v683.vanager.de) returned to #fai. |
| juri_ (NkDrE20Hwt@volumehost.com) returned to #fai. |
| alexanderwz (~alexander@karuna.med.harvard.edu) returned to #fai. |
| MT (~MT@dove.informatik.tu-muenchen.de) left irc: Ping timeout: 480 seconds |
| --- Fri Dec 7 2007 |