LinuxQuestions.org


punt 11-18-2010 11:05 PM

cifs mounts problem (Win7 via Fedora 14) - system isn't recognizing perfectly valid directories
 
UPDATE 11/24: This post spent some time in the Programming forum but doesn't quite fit there, since it's an overall mount problem. In a nutshell, a program is having trouble reading some cifs mounts and I don't know why. That's where I'm stuck right now. I don't code, but if the problem is mount-specific, the diagnosis needs to happen in Linux - Server, since I can't fix code that is working perfectly. ;)

I'm running a piece of code that adds directories to a searchable log file.

The executable runs

Code:

if (S_ISDIR(st.st_mode))
to log the directory. If it's not a directory, it doesn't add to the log.

I have a few mounts on cifs (not all) whose legitimate directories are failing this check.

I've actually spent about 4 days on this already to no avail. I have NO idea what is possibly different with the setup. The permissions are fine. I've rebooted. fstab shows identical code (just different drive letters and mount destinations).

The strange thing is that sometimes these directories *are* seen as directories. Sometimes they're not. NOTHING I am physically doing changes this. I could run the code in one minute and it will fail, and then run the code a few minutes later and it will pass.

I've added the following snippet to the code to figure out what's going on:

Code:

printf("0x%x, %d, %lu, 0%o, %d, %d, %d, 0x%x, %d, %ld, %ld, %ld, %ld, %ld, %ld\n",
 st.st_dev, st.st_ino, st.st_mode, st.st_nlink, st.st_uid, st.st_gid, st.st_rdev,
 st.st_size, st.st_blksize, st.st_blocks, st.st_atime, st.st_mtime, st.st_ctime);

Here's the output on a "bad" directory:

Code:

0x18, 0, 180875, 00, 1, 11434240, 0, 0x0, 524288, 65536, 13078516, 13070820, -1074475084, 0, 134524100
Here's the output on the same exact directory when it is "good":

Code:

0x18, 0, 370935, 040777, 0, 0, 0, 0x0, 0, 0, 16384, 0, 1289755732, 1258475515, 1258475515
In other words, some directories are changing their characteristics. Others are not showing good output at all and the directories are not being logged.
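
One thing worth ruling out first: the printf above appears to pass 13 arguments to a format string with 15 conversions, and several conversions may not match the width of the field they print. That is undefined behavior in C, so some of the odd-looking columns could come from the printing rather than from the mount. A minimal sketch of a safer dump, assuming a POSIX system; `format_stat` is a hypothetical helper name, not something from the original program:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: writes 13 struct stat fields into buf, with exactly
 * one conversion per argument. Every field is cast to a fixed maximum-width
 * type first, because types such as dev_t, ino_t and off_t vary in size
 * between platforms. Returns what snprintf returns. */
int format_stat(const struct stat *st, char *buf, size_t len)
{
    assert(st != NULL && buf != NULL);
    return snprintf(buf, len,
        "dev=0x%jx ino=%ju mode=0%jo nlink=%ju uid=%ju gid=%ju rdev=0x%jx "
        "size=%jd blksize=%jd blocks=%jd atime=%jd mtime=%jd ctime=%jd",
        (uintmax_t)st->st_dev, (uintmax_t)st->st_ino, (uintmax_t)st->st_mode,
        (uintmax_t)st->st_nlink, (uintmax_t)st->st_uid, (uintmax_t)st->st_gid,
        (uintmax_t)st->st_rdev, (intmax_t)st->st_size,
        (intmax_t)st->st_blksize, (intmax_t)st->st_blocks,
        (intmax_t)st->st_atime, (intmax_t)st->st_mtime,
        (intmax_t)st->st_ctime);
}
```

Calling `format_stat(&st, buf, sizeof buf)` after a successful stat() and printing `buf` gives labeled fields, which makes it easier to tell which value is the mode.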

I know this is seriously confusing and I'll likely find nothing from this, but I'm totally at a loss here and I figured this is the only place I can go to help. I have no knowledge whatsoever in C and got what I know thus far by asking a few people.

Thanks in advance.

Valery Reznic 11-19-2010 12:32 AM

Could you show the code that actually makes the stat system call?

And maybe a bit more of the code around it?

GrapefruiTgirl 11-19-2010 05:03 AM

Moved: This thread is more suitable in <Programming> and has been moved accordingly to help your thread/question get the exposure it deserves.

punt 11-19-2010 08:00 AM

(Thanks for moving -- I wasn't sure where to put it. Not sure it fit in Server but it is cifs-related so there's a server element.)

stat doesn't show anything abnormal except size "0" whereas it should be 8192 or 4096, right?

Here's a "bad" dir:

Code:

# stat /mounted/dir1
  File: `/mounted/dir1'
  Size: 0              Blocks: 0          IO Block: 16384  directory
Device: 19h/25d Inode: 562949953543995  Links: 1
Access: (0777/drwxrwxrwx)  Uid: (    0/    root)  Gid: (    0/    root)
Access: 2010-11-17 02:51:07.305175500 -0500
Modify: 2010-11-17 02:51:07.305175500 -0500
Change: 2010-11-17 02:51:07.305175500 -0500

Here's a "good" dir:

Code:

# stat dir2
  File: `dir2'
  Size: 4096            Blocks: 8          IO Block: 4096  directory
Device: 808h/2056d      Inode: 924896      Links: 20
Access: (0777/drwxrwxrwx)  Uid: (    0/    root)  Gid: (    0/    root)
Access: 2010-11-18 04:42:57.112000001 -0500
Modify: 2010-11-16 20:08:28.324000023 -0500
Change: 2010-11-16 20:08:28.324000023 -0500

Here's a currently good dir which sometimes gives me bad results:

Code:

# stat dir3
  File: `dir3'
  Size: 319488          Blocks: 624        IO Block: 16384  directory
Device: 18h/24d Inode: 562949953463760  Links: 1
Access: (0777/drwxrwxrwx)  Uid: (    0/    root)  Gid: (    0/    root)
Access: 2010-11-16 19:54:51.260627500 -0500
Modify: 2010-11-16 19:54:51.260627500 -0500
Change: 2010-11-16 19:54:51.260627500 -0500

I'm so lost.

punt 11-19-2010 08:17 AM

Here's some stuff from dmesg. I don't know the timestamps:

Code:

[ 6873.655031] CIFS VFS: No response for cmd 50 mid 9065
[ 6883.656034] CIFS VFS: Unexpected lookup error -112
[ 6893.657042] CIFS VFS: Unexpected lookup error -112
[ 6903.659028] CIFS VFS: Unexpected lookup error -112
[ 7023.672030] CIFS VFS: Unexpected lookup error -112
[ 7033.673029] CIFS VFS: Unexpected lookup error -112
[ 7048.118031] CIFS VFS: No response for cmd 114 mid 9066
[ 7049.121036] CIFS VFS: No response for cmd 114 mid 9067
[60795.069525] CIFS VFS: Autodisabling the use of server inode numbers on \\mount\mountname. This server doesn't seem to support them properly. Hardlinks will not be recognized on this mount. Consider mounting with the "noserverino" option to silence this message.

Is this related?
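
The -112 in those dmesg lines looks like a negated errno value, which is how the kernel usually reports errors. A small sketch for looking up what such a code means, assuming a Linux/glibc system; `describe_kernel_error` is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turns a kernel-style negative error code, such as the
 * -112 in the dmesg lines, into the C library's description of that errno.
 * The numbering is platform-specific, so run this on the affected machine. */
const char *describe_kernel_error(int kernel_code)
{
    assert(kernel_code <= 0);
    return strerror(-kernel_code);
}
```

On the Linux systems I am aware of, 112 is EHOSTDOWN ("Host is down"), which would be consistent with the "No response" lines, but it is worth confirming locally with something like `puts(describe_kernel_error(-112));`.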

Here is more info from my fstab. First I mount the cifs share, and then I bind-mount a directory from it to where I want to do the logging.

Code:

//mount/s$              /samba/s              cifs    username=user,password=password,dir_mode=0777      0 0
/samba/s/directory            /mount/dir1  bind    rw,bind,dir_mode=777 0 0


theNbomr 11-19-2010 10:17 AM

I don't want to turn this into a Windows flame, but since you're using CIFS and you mention 'drive letters', I guess that the share is on a Windows host. Can you repeat the test from a different client host, or against different servers &/or shares? I'm just trying to establish whether the problem is in your code, or possibly in the server(s) or the CIFS client code itself.

In your code, if you call stat() repeatedly in a tight loop, is the result consistent? Also, can you show us a code fragment that you use to acquire the struct stat?

--- rod.

punt 11-19-2010 11:18 AM

Quote:

Originally Posted by theNbomr (Post 4164475)
I don't want to turn this into a Windows flame, but since you're using CIFS and you mention 'drive letters', I guess that the share is on a Windows host. Can you repeat the test from a different client host, or against different servers &/or shares? I'm just trying to establish whether the problem is in your code, or possibly in the server(s) or the CIFS client code itself.

In your code, if you call stat() repeatedly in a tight loop, is the result consistent? Also, can you show us a code fragment that you use to acquire the struct stat?

--- rod.

Yes, it's a Win7 box. I'm not sure how to repeat this test elsewhere, and Windows is where I intend to be running this so it needs to work... and as far as I'm aware it did when I had XP. I recently upgraded to Win7.

So my knowledge of C is *extremely* basic (I can sort of understand 4% of the code), and therefore, I don't really know how to call stat repeatedly in a tight loop. :( I added that lengthy printf snippet on recommendation from someone else; I have no idea what it is showing.

Here's the struct stuff:

Code:

struct stat st;
 char temppath[MAXPATHLEN];
 snprintf(temppath, MAXPATHLEN, "%s/%s", nambuf, dn->d_name);
 stat(temppath, &st);


That's it. Then I run the printf command as you've seen earlier in this thread.

paulsm4 11-19-2010 11:26 AM

Hi -

Quote:

Also, can you show us a code fragment that you use to acquire the struct stat?
In particular, do you check the return value?

I'm guessing the "VFS lookup error" is probably key. I'm also guessing that maybe the problem is indeed intermittent (as previously suggested), and maybe the same "stat()" call might "magically work" if you retried it after a slight delay.

I'm sure you've had experiences with Windows desktop and/or Windows explorer windows appearing to "hang". Under the covers, that's Windows retrying its little heart out because of some network access glitch or another :)
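
The "retry after a slight delay" idea could be sketched roughly as follows. This assumes a POSIX system with nanosleep(); `stat_with_retry`, its parameters, and the retry policy are all hypothetical and not taken from the original program:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: calls stat() up to `attempts` times, sleeping
 * `delay_ms` milliseconds between failed tries. Returns 0 on the first
 * success; otherwise returns -1 with errno set from the last failure. */
int stat_with_retry(const char *path, struct stat *st, int attempts, long delay_ms)
{
    assert(path != NULL && st != NULL && attempts > 0 && delay_ms >= 0);
    int saved_errno = 0;
    for (int i = 0; i < attempts; i++) {
        if (stat(path, st) == 0)
            return 0;
        saved_errno = errno;          /* remember why this try failed */
        if (i + 1 < attempts) {       /* no point sleeping after the last try */
            struct timespec pause = { delay_ms / 1000,
                                      (delay_ms % 1000) * 1000000L };
            nanosleep(&pause, NULL);
        }
    }
    errno = saved_errno;
    return -1;
}
```

Something like `stat_with_retry(temppath, &st, 5, 200)` would give the server about a second to respond before the entry is skipped. Whether that is enough for this particular share is a guess.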

punt 11-19-2010 11:29 AM

Quote:

Originally Posted by paulsm4 (Post 4164546)
Hi -



In particular, do you check the return value?

I'm guessing the "VFS lookup error" is probably key. I'm also guessing that maybe the problem is indeed intermittent (as previously suggested), and maybe the same "stat()" call might "magically work" if you retried it after a slight delay.

I'm sure you've had experiences with Windows desktop and/or Windows explorer windows appearing to "hang". Under the covers, that's Windows retrying its little heart out because of some network access glitch or another :)

Well, the problem isn't "intermittent" really. To be clear, the problem seems to happen in the first 5-6 hours and then goes away for one mount, but I have *not* at all fixed the other mount.

Yes, I have checked the st.st_mode a few times. It either returned 40777 (valid) or 0 (invalid).

theNbomr 11-19-2010 12:32 PM

Uhm... This IS code running on a Linux host, right? If not, I, and probably a few others, have been barking up the wrong tree.

You have your code fragment that printf()'s a bunch of stuff. If you wrap that in a for loop, like
Code:

int i;
for( i = 0; i < 100; i++ ){
    int ret;
    ret = stat(.....  blah blah blah );
    printf("ret: %d - 0x%x, %d, %08X, 0%o, %d, %d, %d, 0x%x, %d, %ld, %ld, %ld, %ld, %ld, %ld\n",
           ret, st.st_dev, st.st_ino, st.st_mode, st.st_nlink, st.st_uid, st.st_gid, st.st_rdev,
           st.st_size, st.st_blksize, st.st_blocks, st.st_atime, st.st_mtime, st.st_ctime);
}

This should repeat the reading 100 times and barf out the result.

--- rod.
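
For anyone who wants to run that tight-loop test outside the original program, here is a self-contained sketch. It assumes a POSIX system; `count_stat_failures` is a hypothetical name, and it prints only the return value, the error, and the mode rather than the full field dump:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: stats `path` `n` times in a row, prints one line per
 * call, and returns how many of the calls failed. */
int count_stat_failures(const char *path, int n)
{
    assert(path != NULL && n >= 0);
    int failures = 0;
    for (int i = 0; i < n; i++) {
        struct stat st;
        int ret = stat(path, &st);
        if (ret != 0) {
            int err = errno;  /* save before any other library call */
            failures++;
            printf("%3d ret=%d errno=%d (%s)\n", i, ret, err, strerror(err));
        } else {
            /* st is only meaningful here, after a successful call */
            printf("%3d ret=%d mode=0%lo dir=%s\n", i, ret,
                   (unsigned long)st.st_mode,
                   S_ISDIR(st.st_mode) ? "yes" : "no");
        }
    }
    return failures;
}
```

Calling `count_stat_failures("/mounted/dir1", 100)` from a small main() would show whether the result flips between calls or fails consistently.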

punt 11-19-2010 12:49 PM

Sorry, I'm a bit of a newbie. The code already recursively jumps into every file in that directory. All this will do is repeat the same code 100 times. I usually manually run the script and see the output, but it never changes like that right away.

Here's output of a bad directory now based on your code:
Code:

ret: -1 - 0x19, 0, 00019469, 026735032560, 1, 0, 1, 0xae7900, 0, 524288, 0, 65536, 13078516, -1077470700, 0
And to be even clearer: this is a Fedora 14 box. I am mounting Windows 7 mounts via CIFS.

paulsm4 11-19-2010 01:41 PM

Hi -

Well, there's your answer :):
Quote:

ret: -1 - 0x19, 0, 00019469, 026735032560, 1, 0, 1, 0xae7900, 0, 524288, 0, 65536, 13078516, -1077470700, 0
If you get an error return ("-1"), then all bets are off - the rest of the data is garbage.

So now the question is "Why are we getting an error?"

Next stop - you can read specific error codes from "errno".

I'll betcha they correspond to the VFS errors you mentioned above.

Remember - this kind of stuff happens all the time in the Windows environment.

In fact, in ANY network environment. Windows just goes to great lengths to HIDE intermittent "glitches" from the end-user ;)

punt 11-19-2010 02:20 PM

Yeah, those printf statements from earlier confirmed that there are errors recognizing the directory structure; I just don't know why.

Is there a way for me to get these issues reasonably resolved? Why is this more visible in Win7 and not XP? Is there anything I can do? Mount differently?

Valery Reznic 11-19-2010 02:29 PM

Quote:

Originally Posted by punt (Post 4164551)
Well, the problem isn't "intermittent" really. To be clear, the problem seems to happen in the first 5-6 hours and then goes away for one mount, but I have *not* at all fixed the other mount.

Yes, I have checked the st.st_mode a few times. It either returned 40777 (valid) or 0 (invalid).

If your stat function returns -1 (error), then you can't trust whatever was returned in the buffer. - Ooops - didn't see that it was already said

punt 11-19-2010 02:50 PM

Right, so my question now is how to force the mount to be recognized correctly... is this too crazy to ask? :)


All times are GMT -5.