Windows networking problems

Stewart,

On the Windows 7 machine, go to the "c:\windows\system32\drivers\etc" folder. There is a file there called hosts (no extension); open it in Notepad and add a new entry such as:

123.x.x.x machine_name

Save the file, and make sure Win7 doesn't add a .txt extension to it, as it sometimes does. Win7 hides extensions for known file types by default; you can change that behaviour under Folder Options.

If Win7 is not installed on C: just substitute the drive letter as appropriate.

This will allow you to access the machine by name.
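
For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch of the same idea. The function name `add_hosts_entry` and the sample address 192.0.2.10 are made up for illustration. It works on the text of a hosts file rather than on the real file, since writing to the real one needs administrator rights.

```python
def add_hosts_entry(hosts_text: str, ip: str, name: str) -> str:
    """Return hosts-file text with an 'ip name' line added, unless name is already mapped."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into address and host names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name.lower() in (f.lower() for f in fields[1:]):
            return hosts_text  # name already present, leave the text alone
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip} {name}\n"


if __name__ == "__main__":
    sample = "# sample hosts file\n127.0.0.1 localhost\n"
    print(add_hosts_entry(sample, "192.0.2.10", "machine_name"))
```

Host names are compared case-insensitively, so calling it twice with the same name does not add a second line.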
 
As said earlier, that only skirts around the issue, and it stops working if the IP changes for any reason.
 
This will allow you to access the machine by name.
That's very true, but it is a hack that shouldn't be needed, and it gets complex when you have multiple computers AND DHCP involved. This is a solvable problem. A couple of hours now will mean things work much more smoothly in the future.
 
If the server is static, that would explain the ping by name, but not the ping by IP addy
 
It's certainly worth a shot. If it needs to be on a particular IP, then reserve it in DHCP.
Yeah. A router config problem is potentially an area where this could be an issue. It's clear (to me) that we need to get all the machines in the same actual workgroup first.

Stewart: you have my number - if you want any help over the weekend, just give me a ring (I should be building websites so...)
 
If the server is static, that would explain the ping by name, but not the ping by IP addy
No. The fact it thinks it is in a separate workgroup would explain the ping by address but not by name. The NetBios address resolution is learned - machines with static IP addresses (I have several here) resolve perfectly well when the network is configured correctly.
 
As said earlier, that only skirts around the issue, and it stops working if the IP changes for any reason.
Damn you and your quicker than me typing!! :D
 
Does the router or any other device on the network offer DNS service? Not the DNS proxy on the router, as that is only going to act as a cache for the ISP's DNS.

A server's IP shouldn't change; DHCP can be set up to assign a specific IP to its MAC address.

Short of using a DNS server service or the hosts file, I can't understand how you're going to resolve a name to its IP address. (Genuine question, I'm not being sarcastic).
 
Short of using a DNS server service or the hosts file, I can't understand how you're going to resolve a name to its IP address. (Genuine question, I'm not being sarcastic).
Windows networking builds a dynamic list of Netbios names to IP addresses. It's not particularly well documented, but if the network is set up correctly it will work.

At the moment, Stewart's network is split in two as far as Netbios is concerned, which is making resolution across the boundaries dependent on which OS is in use.
 
Wow. What a day.

First of all, MASSIVE thanks to Andy (arad85) who's devoted a huge amount of time and effort to talking me through this. I think we have a solution now, though I haven't yet re-run all the diagnostics to prove it.

The problem appeared to be a Master Browser conflict. As I (vaguely) understand it, this is basically a spat between different machines on the network as to who's in charge of telling the others where things are. Microsoft machines generally tend to defer to the newer OSs in these matters, so the two XP machines and the two Win 7 machines were all happy for one of the Win 7 machines to be the MB. But the server wanted to be the MB because (a) it's a server, and (b) it's running WHS 2003 and it doesn't understand Win 7. So there were effectively two parallel distinct descriptions of the network - the one used by the server, which contained only the server; and the one used by everything else, which contained everything but the server.

We dug into the administrative tools on every PC and stopped the 'Computer Browser' service. The idea was that with only the server having this service running, all the rest would defer to it and accept it as the MB. However, that didn't work. The XP machine panicked and asked one of the Win 7 machines to take on the MB role; apparently it thought that was preferable to allowing the server to do it, despite the fact that the server was up for it and the Win 7 machine wasn't.

So back into administrative tools, and this time we disabled the 'Computer Browser' service on all machines except the server. That gave them all no choice but to defer to the server. Power everything down, power up the server, let it get settled down, and then power everything else up. And it seems to work.

At least my original problem seems to be solved. From my Win 7 machine I can now access the server by name ... which means I can map a network drive to it by name ... which means I can designate folders on the server as trusted locations ... which means I can access and run all my VBA code, stored queries etc ... which means I can get on with hiring out lenses to people.

As I said, I haven't yet proved the solution by re-running all the diagnostics, but I will.

I also have a couple of potential problems outstanding.

  • Firstly, will the laptop be happy being told that it's not allowed to be a MB, even when it's not on the office network?
  • Secondly, what happens if/when the server has to be re-booted? All the other machines in the office will be temporarily left without a MB, and I don't know what effect that will have.

But I think these are (hopefully) small issues.

Thanks again to Andy, and to everybody else who contributed.
 
Stewart: can you post a ipconfig /all from the server?

Also, which router is it?
The router is a Linksys WAG354G. Just a basic wireless modem/router.

Here's the ipconfig /all output:

C:\Documents and Settings\Administrator>ipconfig /all

Windows IP Configuration

Host Name . . . . . . . . . . . . : LFH-SERVER
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Unknown
IP Routing Enabled. . . . . . . . : Yes
WINS Proxy Enabled. . . . . . . . : Yes

Ethernet adapter Hamachi:

Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Hamachi Network Interface
Physical Address. . . . . . . . . : 7A-79-05-4C-E7-61
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : No
IP Address. . . . . . . . . . . . : 5.76.231.97
Subnet Mask . . . . . . . . . . . : 255.0.0.0
Default Gateway . . . . . . . . . :
DHCP Server . . . . . . . . . . . : 5.0.0.1
Lease Obtained. . . . . . . . . . : 21 October 2011 15:33:57
Lease Expires . . . . . . . . . . : 20 October 2012 15:33:57

Ethernet adapter Local Area Connection:

Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Realtek RTL8168C(P)/8111C(P) PCI-E Gigabit Ethernet NIC
Physical Address. . . . . . . . . : 00-1C-C0-C5-A6-CD
DHCP Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 10.0.4.210
Subnet Mask . . . . . . . . . . . : 255.0.0.0
Default Gateway . . . . . . . . . : 10.0.4.254
DNS Servers . . . . . . . . . . . : 10.0.4.254

C:\Documents and Settings\Administrator>

Interestingly I had completely forgotten that the server has a Hamachi VPN installed on it. That was something we did ages ago when I had a contractor doing some stuff for me and he needed remote access. I probably ought to uninstall it. But it's not germane to the issues we've been having today.
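
When comparing dumps like this from several machines, it can help to pull out just the address and mask for each adapter. The following is a rough Python sketch, assuming output laid out like the XP-style dump above. The function name `parse_adapters` is invented here, and the patterns only cover the field labels seen in this thread plus the "IPv4 Address" label that newer Windows versions print.

```python
import re


def parse_adapters(ipconfig_text: str) -> dict:
    """Map adapter name -> {'ip': ..., 'mask': ...} from 'ipconfig /all' text."""
    adapters = {}
    current = None
    for line in ipconfig_text.splitlines():
        # Section headers look like "Ethernet adapter Local Area Connection:".
        header = re.match(r"^\s*(?:Ethernet|Wireless LAN) adapter (.+):\s*$", line)
        if header:
            current = adapters.setdefault(header.group(1), {})
            continue
        # Field lines look like "Subnet Mask . . . . . : 255.0.0.0".
        field = re.match(
            r"^\s*(IP Address|IPv4 Address|Subnet Mask)[ .]*:\s*([\d.]+)", line
        )
        if field and current is not None:
            key = "mask" if field.group(1) == "Subnet Mask" else "ip"
            current[key] = field.group(2)
    return adapters


if __name__ == "__main__":
    sample = (
        "Ethernet adapter Local Area Connection:\n"
        "\n"
        "IP Address. . . . . . . . . . . . : 10.0.4.210\n"
        "Subnet Mask . . . . . . . . . . . : 255.0.0.0\n"
    )
    print(parse_adapters(sample))
```

Running the same function over the dump from each machine gives a quick side-by-side view of the settings.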
 
First of all, MASSIVE thanks to Andy (arad85) who's devoted a huge amount of time and effort to talking me through this.
No problems :)

I think we have a solution now, though I haven't yet re-run all the diagnostics to prove it.
That's excellent news! Will be good to check it.

As I (vaguely) understand it,
It's how I understand it too so ;) (y)

I also have a couple of potential problems outstanding.

  • Firstly, will the laptop be happy being told that it's not allowed to be a MB, even when it's not on the office network?
  • Secondly, what happens if/when the server has to be re-booted? All the other machines in the office will be temporarily left without a MB, and I don't know what effect that will have.
Laptop: I don't think it will matter, but as it's an XP machine, you can probably turn the browser on on that machine with no effect. If you have problems, it may be better to turn the browser on and then control it through the registry settings.

Reboot: if the machines are accessing each other, then the IP address will be in the cache, so nothing will happen. If the machines haven't asked each other for data, I think they will ask using NetBios as they know of the other computers through the cached version of the table.

As an aside, network accesses should get quicker to the server as the network is now in a "good" state.
 
Just an update.

It appears it wasn't directly related to the Master Browser (some of my assertions above were incorrect), but it was a configuration problem. Playing with some commands on my (Unix) server showed that the NetBios lookup is done by broadcast, using an address derived from the IP address and the subnet mask. If the subnet mask isn't correct, the broadcast address isn't the one the other machines are listening on.

Looking back at the ipconfig /all dumps from Stewart shows that the statically allocated IP for the server has a subnet mask of 255.0.0.0 whilst those for the dynamically allocated have subnet masks of 255.255.255.0. The reason this makes a difference is easily shown by example:
My IP addresses are 192.168.1.xxx and my subnet mask is 255.255.255.0. What NetBios does is broadcast a request to an address that is a combination of the IP address and the subnet mask. For each of the four numbers, it takes the number from the IP address where the subnet mask is 255, and uses 255 where the subnet mask is 0 (so my broadcast address is 192.168.1.255). So:

10.0.4.210 with subnet mask 255.0.0.0 will be broadcasting/responding on an address 10.255.255.255
which is different to the other machines as:
10.0.4.18 with subnet mask 255.255.255.0 will be broadcasting/responding on an address 10.0.4.255

Crucially (and this is the bit that appears to have fixed it),
10.0.4.210 with subnet mask 255.255.255.0 will be broadcasting/responding on an address 10.0.4.255 which is the same as all the other machines. Once Stewart did this, the server jumped into the same workgroup as the other machines and everything was shared nicely between machines.
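
The rule described above is a special case of the general one: the broadcast address is the IP address ORed with the inverted subnet mask. Here is a small Python sketch of that arithmetic using the standard `ipaddress` module. The function name `broadcast_address` is just for illustration.

```python
import ipaddress


def broadcast_address(ip: str, mask: str) -> str:
    """Broadcast address = IP OR (NOT mask), worked on the 32-bit integers."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    host_bits = ~mask_int & 0xFFFFFFFF  # the bits the mask leaves free for hosts
    return str(ipaddress.IPv4Address(ip_int | host_bits))


if __name__ == "__main__":
    for ip, mask in [
        ("10.0.4.210", "255.0.0.0"),      # server as originally configured
        ("10.0.4.18", "255.255.255.0"),   # a DHCP client
        ("10.0.4.210", "255.255.255.0"),  # server after the fix
    ]:
        print(f"{ip} / {mask} -> {broadcast_address(ip, mask)}")
```

This prints 10.255.255.255 for the first case and 10.0.4.255 for the other two, matching the figures worked out by hand above.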

He was going to test this in more detail tonight, but I think this is now fixed. Any more news Stewart?
 
Looking back at the ipconfig /all dumps from Stewart shows that the statically allocated IP for the server has a subnet mask of 255.0.0.0 whilst those for the dynamically allocated have subnet masks of 255.255.255.0.

:bang:

This is what I mean when I say it's so much harder diagnosing issues when you're not sat in front of the box(es)... so much easier to miss something.
 
Don't think I'd have seen it even then. Needed to understand how the address lookup happened (via a broadcast) before it clicked. The fact we ended up with two master browsers is just an effect of the problem. I'll know for the next time tho :geek:
 
He was going to test this in more detail tonight, but I think this is now fixed. Any more news Stewart?
I've sent you a long email Andy, with details of all the tests I ran, but the short answer is yes, it all seems OK now.

A picture is worth a thousand words, as they say:

Network.PNG


That seems like a major result. All the machines (WHS 2003 server, XP laptop, XP desktop, W7 desktops, even the Virtual XP machine) can all see one another and can all share files etc without any drama.

But I'd appreciate your confirmation that the diagnostics I've sent you are OK.
 
Just replied to your mail Stewart. Yes, it looks to be working properly now :) All that hassle just because the server had a couple of 0s instead of 255s in a setting somewhere. Bloomin software!
 
This is a very long thread, which I've only scanned quickly. Pleased to see the problem has been sorted, but I couldn't resist adding my twopennorth - for what it's worth. But feel free to ignore it :)

Another resolution would have been to ensure the Win 7 computers had Network Discovery turned on and HomeGroups turned off. (I think that was covered early on.) In Windows Firewall ensure File and Printer Sharing is allowed, assuming you have a decent router, which will act as the firewall from outside the network.

Change the name of the Workgroup on every computer, starting with the Server, to anything other than an existing Workgroup name, making sure every computer has exactly the same Workgroup name.

Open a Command Prompt on each computer and type (without the quotes) "ipconfig /flushdns" as the original issue was essentially a DNS problem and this will get rid of the DNS cache in each computer so it is forced to start again.

Then rebuild the network on each computer by opening a command prompt and typing (without the quotes) "netsh int ip reset c:\resetlog.txt"

Edit: (If the Server had a static IP Address before then you may wish to add this again. Resetting the IP Protocol would have wiped that out.)

That should have fixed the problem, although I realise it has already been fixed now anyway :)

In simple terms, the Subnet 255.255.255.0 essentially tells a computer to ignore the 1st 3 Octets of an IP Address, and to only worry about the 4th Octet. So, a computer with an address of 10.0.4.x with a Subnet Mask of 255.255.255.0 will assume all other computers are on the same Subnet of 10.0.4.

The Server with the address of 10.0.4.210 with a Subnet Mask of 255.0.0.0 will be looking for any computers with an IP Address of 10.x.x.x
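
To make the two views concrete, here is a short Python sketch using the standard `ipaddress` module. The helper names `local_network` and `considers_local`, and the peer address 10.9.9.9, are invented for the example.

```python
import ipaddress


def local_network(ip: str, mask: str) -> ipaddress.IPv4Network:
    """The address range a host treats as its own network, given its IP and mask."""
    return ipaddress.ip_interface(f"{ip}/{mask}").network


def considers_local(ip: str, mask: str, peer: str) -> bool:
    """True if a host configured with ip/mask would treat peer as on-link."""
    return ipaddress.ip_address(peer) in local_network(ip, mask)


if __name__ == "__main__":
    server = local_network("10.0.4.210", "255.0.0.0")
    desktop = local_network("10.0.4.18", "255.255.255.0")
    # The desktop's range sits entirely inside the server's much larger one.
    print(server, desktop, desktop.subnet_of(server))
    # A hypothetical peer at 10.9.9.9 is local to the server but not the desktop.
    print(considers_local("10.0.4.210", "255.0.0.0", "10.9.9.9"))
    print(considers_local("10.0.4.18", "255.255.255.0", "10.9.9.9"))
```

The first line of output is `10.0.0.0/8 10.0.4.0/24 True`: both machines agree that 10.0.4.x addresses are local, but they disagree about the size of the network they are on.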

I'm not sure why this stopped the network from functioning correctly. It would make network discovery much slower, but it should still have worked. However, as neil_g mentioned, diagnosis over a text thread is very difficult, especially for a network consisting of so many flavours of Windows.
 
Another resolution would have been to ensure the Win 7 computers had Network Discovery turned on and HomeGroups turned off. (I think that was covered early on.)
Does Network Discovery work across different subnet networks? If you subnet two networks into non overlapping spaces, you should have two separate networks that can't see each other, even though they are on the same piece of cable. I'd have expected the same for hierarchical subnets in a workgroup only environment (i.e. where you don't have Domain Controllers).

Open a Command Prompt on each computer and type (without the quotes) "ipconfig /flushdns" as the original issue was essentially a DNS problem and this will get rid of the DNS cache in each computer so it is forced to start again.
No it wasn't. It was a NetBios issue. The NetBios cache is different to the DNS cache. DNS is ONLY used when there is a period in the network name. Accessing a computer as \\COMP-A will use NetBios. Accessing it as (say) \\COMP-A.home will use DNS.

Look at the following command sequences:

Code:
G:\Users\Andy>nbtstat -c

Local Area Connection 2:
Node IpAddress: [192.168.1.24] Scope Id: []

    No names in cache

G:\Users\Andy>ping mainserver.home

Pinging mainserver.home [192.168.1.10] with 32 bytes of data:
Reply from 192.168.1.10: bytes=32 time=1ms TTL=64
...

G:\Users\Andy>nbtstat -c

Local Area Connection 2:
Node IpAddress: [192.168.1.24] Scope Id: []

    No names in cache

G:\Users\Andy>ping mainserver

Pinging mainserver [192.168.1.10] with 32 bytes of data:
Reply from 192.168.1.10: bytes=32 time<1ms TTL=64
...

G:\Users\Andy>nbtstat -c

Local Area Connection 2:
Node IpAddress: [192.168.1.24] Scope Id: []

                  NetBIOS Remote Cache Name Table

        Name              Type       Host Address    Life [sec]
    ------------------------------------------------------------
    MAINSERVER     <00>  UNIQUE          192.168.1.10        597

And the following is also the case:

Code:
G:\Users\Andy>ipconfig /flushdns

Windows IP Configuration

Successfully flushed the DNS Resolver Cache.

G:\Users\Andy>ping mainserver

Pinging mainserver [192.168.1.10] with 32 bytes of data:
Reply from 192.168.1.10: bytes=32 time=1ms TTL=64
... 

G:\Users\Andy>ipconfig /displaydns

Windows IP Configuration

G:\Users\Andy>ping mainserver.home

Pinging mainserver.home [192.168.1.10] with 32 bytes of data:
Reply from 192.168.1.10: bytes=32 time=1ms TTL=64
...

G:\Users\Andy>ipconfig /displaydns

....
    mainserver.home
    ----------------------------------------
    Record Name . . . . . : mainserver.home
    Record Type . . . . . : 1
    Time To Live  . . . . : 86392
    Data Length . . . . . : 4
    Section . . . . . . . : Answer
    A (Host) Record . . . : 192.168.1.10


    Record Name . . . . . : ns1.home
    Record Type . . . . . : 1
    Time To Live  . . . . : 86392
    Data Length . . . . . : 4
    Section . . . . . . . : Additional
    A (Host) Record . . . : 192.168.1.10
....

so there are two different caches, one for NetBios names, one for DNS names.
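
The rule of thumb from this post (a period in the name means DNS, no period means NetBios) can be written down as a tiny Python sketch. It models only that rule as stated here; the real Windows resolver order is more involved, and the function name `resolution_path` is made up for the example.

```python
def resolution_path(target: str) -> str:
    """Return 'DNS' or 'NetBIOS' for a host name or UNC path, per the rule of thumb."""
    # Strip the leading backslashes of a UNC path and drop any share or file part.
    host = target.lstrip("\\").split("\\", 1)[0]
    return "DNS" if "." in host else "NetBIOS"


if __name__ == "__main__":
    for target in [r"\\COMP-A", r"\\COMP-A.home", "mainserver", "mainserver.home"]:
        print(target, "->", resolution_path(target))
```

Only the host part is examined, so a period in a file name further along the path does not change the answer.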

Then rebuild the network on each computer by opening a command prompt and typing (without the quotes) "netsh int ip reset c:\resetlog.txt"

Edit: (If the Server had a static IP Address before then you may wish to add this again. Resetting the IP Protocol would have wiped that out.)
If it does reset the static IP, this would have fixed the problem as it would pick up the correct subnet mask. It may have caused a different problem - for example if any ports are forwarded from the router to the server. There's normally a reason for static IPs (I like them here as I run a DNS server on one of my machines and it's just easier to deal with rather than having dynamically allocated ones)...

In simple terms, the Subnet 255.255.255.0 essentially tells a computer to ignore the 1st 3 Octets of an IP Address, and to only worry about the 4th Octet. So, a computer with an address of 10.0.4.x with a Subnet Mask of 255.255.255.0 will assume all other computers are on the same Subnet of 10.0.4.

The Server with the address of 10.0.4.210 with a Subnet Mask of 255.0.0.0 will be looking for any computers with an IP Address of 10.x.x.x
I don't think that is quite right. What the subnet mask also does is define the broadcast address. The broadcast address is different between the two different networks described. TBH, I'm surprised it worked at all as you cannot infer that any address OTHER than 10.255.255.255 is a broadcast request on the 10.x.x.x network (i.e. 10.0.4.255 isn't a broadcast address on that network, so the server shouldn't respond to it).
 
Thanks for pointing out the NetBios v DNS issues Arad85. I did mention that I'd only scanned the thread quickly and I did suggest my points could be ignored ;)

Flushing the DNS Cache would certainly be a step in any network issue, even if DNS was not the culprit.

Where I have suggested "Another resolution..." it was meant to be one resolution of many steps - not limited to just turning on Network Discovery and turning off HomeGroups, (which had already been tried.)

Changing the name of the Workgroup would probably have helped, as there seem to have been two Workgroups with the same name being treated as different. This is a common fix for Windows network issues (similar to leaving and re-joining domains).

As a final step, rebuilding the IP Stack would probably have solved this issue. The Static IP Address would need to be added back in, if it was there for a reason. Otherwise it's best left out, as the Router is running the DHCP.
 
Changing the name of the Workgroup would probably have helped, as there seem to have been two Workgroups with the same name being treated as different.
They were treated as different because they were on different networks: one was on a class A network (the 255.0.0.0 mask), the other on a subnet of that network (255.255.255.0).
 