[Mingw-users] Linux dirent->d_type ... Windows equivalent?
Boyd, Todd M.
2008-06-04 20:56:09 UTC
First off, I apologize if the answer to my question is an "obvious" one.
I have spent an exorbitant amount of time searching through mailing list
archives, forums, FAQs, help files, and websites to no avail.



I have written a program in C and compiled it on my VirtualPC, which is
running Debian. In this program, I am using the "scandir" function and
the "dirent" structure(s) it returns. The "d_type" member of the
"dirent" structure is where I hit a discrepancy between Windows and
Linux. This member is apparently not present in Windows' GNU version of
"dirent.h".



I have seen some confusing posts elsewhere about using the Microsoft
Visual C header file "direct.h" (albeit modified) to accomplish the
cross-over, but I don't even know where to begin as far as what needs to
be removed from the MSVC header in order for it to properly compile
under MinGW's "gcc" compiler. So, I pose two questions:



1.) Is the MSVC "hack" the best (or most viable) solution to my
problem, or is there a pre-configured "dirent.h"/"direct.h" out there
that will accomplish the cross-over?

2.) If the MSVC "hack" is the way to go, what needs to be stripped
from the file (and what other steps need to be taken) for it to compile
properly with MinGW's "gcc"?



Any help you can offer would be greatly appreciated! Again, I have
already written the program. It works (in Linux)... like a charm. I
merely want to port it over to Windows without making any major
modifications to the source, seeing as how the only discrepancy between
the two platforms (with regard to my code) is the absence of
"dirent->d_type" to determine if the "file" in question is actually a
directory. I understand that there are other ways to traverse
directories in Windows, but all of them I have discovered so far would
require a complete overhaul of most of the code involved in my project.



Thanks!





Todd Boyd

Web Programmer





P.S. - For those of you who are curious... the program is a crawler that
will traverse directories through recursive descent and search
specified file extensions for regex matches in their contents.
Tor Lillqvist
2008-06-04 22:29:11 UTC
I am using the "scandir" function and the
"dirent" structure(s) it returns. The "d_type" member of the "dirent"
structure is where I hit a discrepancy between Windows and Linux. This
member is apparently not present in Windows' GNU version of "dirent.h".
Actually dirent.h and the implementation of its functions in
libmingwex are not "GNU" in any sense. They are Public Domain, not GPL
or LGPL, and are not developed as part of any GNU project to the best
of my knowledge.

There is no scandir() function in the MS C library or libmingwex
either, so that is also a concern. The lack of d_type can always be
compensated for by calling stat() on each file that readdir() returns.

If you use the MS findfirst()/findnext() functions instead, then the
finddata struct contains the attrib bitmask, and if the _A_SUBDIR
bit is set, then the file is a directory.
I don't even know where to begin as far as what needs to be removed from the
MSVC header in order for it to properly compile under MinGW's "gcc"
compiler.
Why would you need to do that when mingw already comes with a
direct.h? And in any case, the functionality in the MS C library that
most closely corresponds to that in dirent.h is declared in io.h, not
direct.h. (I mean the findfirst(), findnext() and struct finddata
stuff.)
I understand that there are
other ways to traverse directories in Windows, but all of them I have
discovered so far would require a complete overhaul of most of the code
involved in my project.
As there is no scandir(), you will have to rework your code in any
case. The most portable way (i.e. which works both on Unix and on
Windows with mingw) is to use opendir(), readdir(), and closedir()
instead of scandir(). On Linux you can use the d_type member of struct
dirent to see which directory entry is a directory, on Windows you
would need to stat() each directory entry separately.
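
The opendir()/readdir()/stat() approach described above might look roughly like the following. This is only a minimal sketch: the helper names is_directory() and count_subdirs() and the fixed 4096-byte path buffer are assumptions made for illustration, not anything from the original program. Note that the mode has to be tested with S_ISDIR() (or masked with S_IFMT and compared), because OR-ing st_mode with S_IFDIR is nonzero for every file.

```c
#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <sys/stat.h>

/* Returns 1 if `path` names a directory, 0 if it does not,
   and -1 if stat() fails (hypothetical helper). */
int is_directory(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    /* S_ISDIR masks the file-type bits and compares them with S_IFDIR. */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Counts the subdirectories of `dir`, skipping "." and "..".
   Returns -1 if the directory cannot be opened (hypothetical helper). */
int count_subdirs(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *ent;
    char path[4096];   /* assumed to be large enough for this sketch */
    int count = 0;
    int len;

    if (d == NULL)
        return -1;
    while ((ent = readdir(d)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        /* readdir() yields bare names; stat() needs the joined path. */
        len = snprintf(path, sizeof path, "%s/%s", dir, ent->d_name);
        if (len < 0 || len >= (int)sizeof path)
            continue;  /* path too long for the buffer; skipped here */
        if (is_directory(path) == 1)
            count++;
    }
    closedir(d);
    return count;
}
```

Calling count_subdirs(".") would then report how many subdirectories the current directory holds, using the same source on Linux and, as far as the headers allow, under MinGW.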

But... if you really want to be able to handle all possible files on a
Windows machine you should not use the opendir(), readdir(),
closedir() functions, but their wide character (Unicode as UTF-16)
equivalents _wopendir(), _wreaddir(), _wclosedir() instead. Otherwise
your code will miss out names containing characters not in the system
codepage. Depending on your intended user base, this might be relevant
or not.

(Note that if your application is at all related to issues that might
make users want to circumvent it, for policy abuse or whatever similar
reasons, you definitely should use the wide character variants to
really find all files. Otherwise clever users will soon find out that
if they put their pr0n, warez, or whatever in a folder with Greek
letters in the name, for instance, your program won't find it.)

--tml
Boyd, Todd M.
2008-06-05 13:28:09 UTC
-----Original Message-----
Sent: Wednesday, June 04, 2008 5:29 PM
To: MinGW Users List
Subject: Re: [Mingw-users] Linux dirent->d_type ... Windows
equivalent?
I am using the "scandir" function and the
"dirent" structure(s) it returns. The "d_type" member of the
"dirent"
structure is where I hit a discrepancy between Windows and Linux.
This
member is apparently not present in Windows' GNU version of
"dirent.h".
Actually dirent.h and the implementation of its functions in
libmingwex are not "GNU" in any sense. They are Public Domain, not GPL
or LGPL, and are not developed as part of any GNU project to the best
of my knowledge.
...this was a foolhardy assumption on my part. I saw it in MinGW and in
my Linux headers, and so I decided that they were GNU libraries. My
mistake.
There is no scandir() function in the MS C library or libmingwex
either, so that is also a concern. The lack of d_type can always be
compensated for by calling stat() on each file that readdir() returns.
If you use the MS findfirst()/findnext() functions instead, then the
finddata struct contains the attrib bitmask, and if the _A_SUBDIR
bit is set, then the file is a directory.
After doing a bit more reading, I'm guessing I can use readdir(), stat()
the files readdir() finds, and check "dirent->st_mode | S_IFDIR" to
differentiate files from directories. Does this seem like it would work?
I don't even know where to begin as far as what needs to be removed
from the
MSVC header in order for it to properly compile under MinGW's "gcc"
compiler.
Why would you need to do that when mingw already comes with a
direct.h? And in any case, the functionality in the MS C library that
most closely corresponds to that in dirent.h is declared in io.h, not
direct.h. (I mean the findfirst(), findnext() and struct finddata
stuff.)
I suppose I didn't need to do that. I believe I mentioned something
about reading a series of confusing texts that led me to my conclusion.
I have not been able to find much documentation on "dirent" as it
relates to Windows vs. Linux.
I understand that there are
other ways to traverse directories in Windows, but all of them I have
discovered so far would require a complete overhaul of most of the
code
involved in my project.
As there is no scandir(), you will have to rework your code in any
case. The most portable way (i.e. which works both on Unix and on
Windows with mingw) is to use opendir(), readdir(), and closedir()
instead of scandir(). On Linux you can use the d_type member of struct
dirent to see which directory entry is a directory, on Windows you
would need to stat() each directory entry separately.
As I mentioned above, would the S_IFDIR flag of st_mode be appropriate
for both operating systems? I would like the code to be as universal as
possible.
But... if you really want to be able to handle all possible files on a
Windows machine you should not use the opendir(), readdir(),
closedir() functions, but their wide character (Unicode as UTF-16)
equivalents _wopendir(), _wreaddir(), _wclosedir() instead. Otherwise
your code will miss out names containing characters not in the system
codepage. Depending on your intended user base, this might be relevant
or not.
I remember reading a bit on the wide-character versions of functions in
Windows. None of the file names in question will contain any abnormal
characters (basically just a-z, 0-9, maybe a hyphen or a period every
now and again), but I am nonetheless going to familiarize myself with
using the wide-character versions. I'd like to be comfortable with them
in the future, and I may as well start now.
(Note that if your application is at all related to issues that might
make users want to circumvent it, for policy abuse or whatever similar
reasons, you definitely should use the wide character variants to
really find all files. Otherwise clever users will soon find out that
if they put their pr0n, warez, or whatever in a folder with Greek
letters in the name, for instance, your program won't find it.)
No user file storage going on here. However, it's still a very pertinent
suggestion since security and the omnipotence of my scanning program are
chief concerns.

Thank you for all of your help and suggestions! I am going to try and
rewrite my code--first in Linux, to use the opendir() readdir()
closedir() functions; next in Windows, to test the "_w" equivalents of
these functions. Or have I misunderstood--are they included in the POSIX
specifications as well?

Again, thank you!


Todd Boyd
Web Programmer
Tor Lillqvist
2008-06-05 13:44:17 UTC
Post by Boyd, Todd M.
After doing a bit more reading, I'm guessing I can use readdir(), stat()
the files readdir() finds, and check "dirent->st_mode | S_IFDIR" to
differentiate files from directories. Does this seem like it would work?
Yes indeed.
Post by Boyd, Todd M.
I remember reading a bit on the wide-character versions of functions in
Windows. None of the file names in question will contain any abnormal
characters (basically just a-z, 0-9, maybe a hyphen or a period every
now and again),
OK. If you know that for sure (for instance because some program you
have control over generates the files), then it should be fine to use
just the "narrow" versions of the functions, and then be able to use
the same code on Linux and Windows. (The only difference then being
that on Linux you can look at d_type, while on Windows you need to
stat() each file.)
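
One way to keep a single source file while still using d_type where it exists is a small wrapper that falls back to stat(). The sketch below assumes the glibc convention of defining _DIRENT_HAVE_D_TYPE and the DT_* macros when the member is available; entry_is_dir() is a hypothetical name, and full_path is assumed to be the directory joined with d_name by the caller.

```c
#include <dirent.h>
#include <sys/stat.h>

/* Returns 1 if the entry is a directory, 0 if it is not, and -1 if that
   cannot be determined. `full_path` is the entry's complete path. */
int entry_is_dir(const struct dirent *ent, const char *full_path)
{
    struct stat st;

#if defined(_DIRENT_HAVE_D_TYPE) && defined(DT_DIR)
    /* Fast path: trust d_type unless the filesystem reports DT_UNKNOWN.
       Symbolic links also go to stat() so both paths follow links. */
    if (ent->d_type != DT_UNKNOWN && ent->d_type != DT_LNK)
        return ent->d_type == DT_DIR;
#else
    (void)ent;  /* no d_type on this platform: only the path is used */
#endif
    if (stat(full_path, &st) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

Where the macros are not defined, the preprocessor drops the d_type branch and every entry goes through stat(), so the calling code does not need its own #ifdef.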
Post by Boyd, Todd M.
but I am nonetheless going to familiarize myself with
using the wide-character versions. I'd like to be comfortable with them
in the future, and I may as well start now.
Yes, that is a good idea.
Post by Boyd, Todd M.
next in Windows, to test the "_w" equivalents of
these functions. Or have I misunderstood--are they included in the POSIX
specifications as well?
Nope. In POSIX file names are just strings of bytes (chars), on all
levels. Any interpretation of the byte strings as ISO-8859-1, UTF-8,
EUC-JP or whatever character set and encoding is up to user level code
and convention at each site.

On Windows, file names as actually stored on disk (when using NTFS)
are strings of wchar_t (16-bit Unicode characters). The Windows kernel
below the Win32 layer uses just these wchar_t file names, as far as I
know. (Actually some rare characters might take two wchar_t units, a
so-called surrogate pair, but the Win32 API or kernel layers don't
"know" that, as far as I know, so making sure surrogate pairs are not
broken is up to user level code.)
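
As an illustration of what "not breaking a surrogate pair" can mean for user-level code, here is a small sketch that picks a safe place to truncate a UTF-16 name. It uses uint16_t rather than wchar_t so that it behaves the same on Linux (where wchar_t is usually 32 bits) and on Windows; the function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* True if `u` is the first (high) half of a UTF-16 surrogate pair. */
static int is_high_surrogate(uint16_t u)
{
    return u >= 0xD800 && u <= 0xDBFF;
}

/* True if `u` is the second (low) half of a UTF-16 surrogate pair. */
static int is_low_surrogate(uint16_t u)
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

/* Given a UTF-16 string of `len` code units and a desired cut point `cut`
   (also in code units), returns a cut point <= cut that does not separate
   a high surrogate from the low surrogate that follows it. */
size_t utf16_safe_cut(const uint16_t *s, size_t len, size_t cut)
{
    if (cut >= len)
        return len;
    if (cut > 0 && is_high_surrogate(s[cut - 1]) && is_low_surrogate(s[cut]))
        return cut - 1;  /* back up so the whole pair is dropped together */
    return cut;
}
```

For the four units 0x0041, 0xD834, 0xDD1E, 0x0042, a requested cut at index 2 would fall between the two halves of the pair, so the function backs up to index 1.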

--tml
Boyd, Todd M.
2008-06-05 14:43:02 UTC
-----Original Message-----
Sent: Thursday, June 05, 2008 8:44 AM
To: MinGW Users List
Subject: Re: [Mingw-users] Linux dirent->d_type ... Windows
equivalent?

---8<--- snip
Post by Boyd, Todd M.
next in Windows, to test the "_w" equivalents of
these functions. Or have I misunderstood--are they included in the
POSIX
Post by Boyd, Todd M.
specifications as well?
Nope. In POSIX file names are just strings of bytes (chars), on all
levels. Any interpretation of the byte strings as ISO-8859-1, UTF-8,
EUC-JP or whatever character set and encoding is up to user level code
and convention at each site.
On Windows, file names as actually stored on disk (when using NTFS)
are strings of wchar_t (16-bit Unicode characters). The Windows kernel
below the Win32 layer uses just these wchar_t file names, as far as I
know. (Actually some rare characters might take two wchar_t units, a
so-called surrogate pair, but the Win32 API or kernel layers don't
"know" that, as far as I know, so making sure surrogate pairs are not
broken is up to user level code.)
That's what I thought: everything in POSIX is treated as binary, without
differentiation. If the only difference between my code is going to be
"_wopendir()" vs. "opendir()", etc., then I can live with that. I'll add
compiler directives to make the cross-over easier.

It's been a while since I've had to play with C/C++, and I've run into
what appears to be a final road-block. I've got a dynamically-sized
array of "dirent" structures called "namelist". I want readdir() to
populate this array of structures. Everything works except dynamically
allocating memory for each individual array index. Here's a bit more
info:

[code]

struct dirent **namelist;
int n = 0;
struct dirent * tempEnt;

dirToScan = opendir(curDir);

if(dirToScan == NULL)
{
    perror("Error (opendir)");
    exit(1);
}

while((tempEnt = readdir(dirToScan)) != NULL)
{
    namelist[n] = malloc(sizeof(struct dirent));
    namelist[n++] = &tempEnt;
}

if(n <= 0)
{
    perror("Error (readdir)");
    exit(1);
}

[/code]

This is all contained within the function recursiveScan(), which accepts
a character array parameter "curDir". "dirToScan" has been declared
elsewhere as "DIR * dirToScan".

I've tried using pointer tricks like *(namelist + n) and such, but I
don't remember enough about dereferencing to wrap my brain around this
right now.

I understand I'm veering a bit off-topic here, but any help you can
offer would be greatly appreciated. I'm so close to being finished!
Then, I can work on the Windows port by (hopefully) just replacing the 3
"dir" functions with their wide-character counterparts.

Thanks!!


Todd Boyd
Web Programmer


P.S. - I opted to stat() each file even in the POSIX version for code
universality. It may be a bit more work than just pulling the
dirent->d_type out, but from what I've read, S_IFDIR will work, as well.
Also, I referred to "dirent->st_mode" in my previous message, when I
meant to write "stat.st_mode". My mistake.
Tuomo Latto
2008-06-05 16:01:23 UTC
Post by Boyd, Todd M.
namelist[n] = malloc(sizeof(struct dirent));
namelist[n++] = &tempEnt;
Were you planning on using the memory
or are you just allocating it for the fun of it?
--
Tuomo

... She's dead, Jim. Should we bury her or have some fun?
Boyd, Todd M.
2008-06-05 16:43:14 UTC
-----Original Message-----
Sent: Thursday, June 05, 2008 11:01 AM
To: MinGW Users List
Subject: Re: [Mingw-users] Linux dirent->d_type ... Windows
equivalent?
Post by Boyd, Todd M.
namelist[n] = malloc(sizeof(struct dirent));
namelist[n++] = &tempEnt;
Were you planning on using the memory
or are you just allocating it for the fun of it?
It's being used in the code that I posted. When did allocating memory
become fun? :)

"namelist[n++] = &tempEnt;" (which may actually need to just be tempEnt
w/o any address) should, if I am guessing correctly, be assigning
namelist[n] the value of tempEnt (and subsequently incrementing "n").

Do you ask because I have semantic errors? I thought I needed to reserve
the space since the number of elements in namelist[] needs to be
dynamic, and they will each be storing a dirent structure.

Have I missed something?


Todd Boyd
Web Programmer
John Brown
2008-06-05 17:30:19 UTC
Post by Boyd, Todd M.
Post by Tuomo Latto
Post by Boyd, Todd M.
namelist[n] = malloc(sizeof(struct dirent));
namelist[n++] = &tempEnt;
Were you planning on using the memory
or are you just allocating it for the fun of it?
It's being used in the code that I posted. When did allocating memory
become fun? :)
"namelist[n++] = &tempEnt;" (which may actually need to just be tempEnt
w/o any address) should, if I am guessing correctly, be assigning
namelist[n] the value of tempEnt (and subsequently incrementing "n").
Do you ask because I have semantic errors? I thought I needed to reserve
the space since the number of elements in namelist[] needs to be
dynamic, and they will each be storing a dirent structure.
Have I missed something?
Todd Boyd
Web Programmer
Namelist is an array of struct dirent*. You defined namelist as:
struct dirent ** namelist;
Assuming that namelist is defined in your function, then namelist has not
been initialised to a valid pointer. Therefore, the statement
namelist[n] = malloc(sizeof(struct dirent));
will give an undefined result for all values of n. If you define namelist as:
struct dirent ** namelist = NULL;
you will see what I mean.

If you instead write:
struct dirent ** namelist = malloc(sizeof(struct dirent**));
you will have a valid array of dirent*, with length == 1.
An assignment to namelist[0] is now legal, but an assignment
to namelist[1] is not, because you are assigning to the second
element, which does not exist.

On each pass through the loop, you have to realloc() namelist
so that it has space for the new dirent*. This is inefficient. You
need to find out how many dirent* there are, and allocate them
up front:
int x = count_dirents();
struct dirent**namelist = malloc(x * sizeof(struct dirent**));
However, it is also inefficient to traverse your directory tree to
get this number, then traverse it again.

If you do not know how many dirent* you need, and you do not
wish to allocate an arbitrary huge number that (you hope) will
never be exceeded, you need to use a data structure that
allows appending. If you are using C++, you can use std::vector.
If you are using C, you will have to make up your own linked list.
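
A middle ground between calling realloc() on every pass and counting the entries up front is to double the capacity whenever the array fills up, so that reallocations stay rare. The sketch below is one possible shape for that in plain C. The struct and function names are hypothetical, and it stores copies of the entry names rather than whole struct dirent values, which is a simplification chosen for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <dirent.h>

/* Hypothetical growable list of directory-entry names. */
struct name_list {
    char **names;
    size_t count;
    size_t capacity;
};

/* Appends a copy of `name`; returns 0 on success, -1 if allocation fails. */
int name_list_append(struct name_list *list, const char *name)
{
    char *copy;

    if (list->count == list->capacity) {
        /* Double the capacity so reallocations are infrequent. */
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        char **grown = realloc(list->names, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->names = grown;
        list->capacity = new_cap;
    }
    copy = malloc(strlen(name) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, name);
    list->names[list->count++] = copy;
    return 0;
}

/* Reads every entry of `dir` into `list` in a single pass.
   Returns 0 on success, -1 on failure. */
int name_list_read_dir(struct name_list *list, const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *ent;
    int rc = 0;

    if (d == NULL)
        return -1;
    /* The buffer readdir() returns may be overwritten by the next call,
       so the name is copied instead of storing the returned pointer. */
    while (rc == 0 && (ent = readdir(d)) != NULL)
        rc = name_list_append(list, ent->d_name);
    closedir(d);
    return rc;
}

/* Releases every copied name and resets the list to empty. */
void name_list_free(struct name_list *list)
{
    size_t i;

    for (i = 0; i < list->count; i++)
        free(list->names[i]);
    free(list->names);
    list->names = NULL;
    list->count = list->capacity = 0;
}
```

A list declared as `struct name_list list = {0};` starts out empty, and the directory only has to be traversed once.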


Tor Lillqvist
2008-06-05 17:56:30 UTC
Post by John Brown
you need to use a data structure that
allows appending. If you are using C++, you can use std::vector.
If you are using C, you will have to make up your own linked list.
Or use some library that provides a useful dynamic array data
structure. Like GLib. And there must be others.

--tml
Boyd, Todd M.
2008-06-05 18:26:02 UTC
-----Original Message-----
Sent: Thursday, June 05, 2008 12:30 PM
To: MinGW Users List
Subject: Re: [Mingw-users] Linux dirent->d_type ... Windows
equivalent?

---8<--- snip
struct dirent ** namelist;
Assuming that namelist is defined in your function, then namelist has not
been initialised to a valid pointer. Therefore, the statement
namelist[n] = malloc(sizeof(struct dirent));
struct dirent ** namelist = NULL;
you will see what I mean.
This makes sense now. As I said, it's been a while since I've had to
deal with C/C++ and pointers. (I've been programming in Java and
ASP.NET/VB.NET mainly for the last year.)
struct dirent ** namelist = malloc(sizeof(struct dirent**));
you will have a valid array of dirent*, with length == 1.
An assignment to namelist[0] is now legal, but an assignment
to namelist[1] is not, because you are assigning to the second
element, which does not exist.
This also makes sense. I was afraid that it would need to be fully
allocated up front... it appears that is the case (dynamic structures
notwithstanding).
On each pass through the loop, you have to realloc() namelist
so that it has space for the new dirent*. This is inefficient. You
need to find out how many dirent* there are, and allocate them
int x = count_dirents();
struct dirent**namelist = malloc(x * sizeof(struct dirent**));
However, it is also inefficient to traverse your directory tree to
get this number, then traverse it again.
For the time being, I will most likely implement a double traversal of
the directory tree. Indeed, it will be inefficient in terms of
iterations per recursion, but it will get the job done for now.
Modifications can (and will) be made after the first (albeit sloppy)
version is finished.
If you do not know how many dirent* you need, and you do not
wish to allocate an arbitrary huge number that (you hope) will
never be exceeded, you need to use a data structure that
allows appending. If you are using C++, you can use std::vector.
If you are using C, you will have to make up your own linked list.
I am using strict C. I have written a linked list using C in the past,
and perhaps I can locate the code once again. However, I'm most likely
going to go with Tor's suggestion of a pre-packaged library, such as
Glib. Hopefully, I won't run into any serious pitfalls getting it to
work with MinGW. (I'm assuming the "G" in "Glib" is for "GNU" the same
as the "G" in "MinGW" is, but perhaps it simply means "General.")

Thank all of you for your assistance! I can see the light at the end of
the tunnel, I've just got a bit more coding to do before I get there.


Todd Boyd
Web Programmer
Tuomo Latto
2008-06-05 20:53:12 UTC
Post by Boyd, Todd M.
Post by Tuomo Latto
Post by Boyd, Todd M.
namelist[n] = malloc(sizeof(struct dirent));
namelist[n++] = &tempEnt;
Were you planning on using the memory
or are you just allocating it for the fun of it?
It's being used in the code that I posted. When did allocating memory
become fun? :)
"namelist[n++] = &tempEnt;" (which may actually need to just be tempEnt
w/o any address) should, if I am guessing correctly, be assigning
namelist[n] the value of tempEnt (and subsequently incrementing "n").
Do you ask because I have semantic errors? I thought I needed to reserve
the space since the number of elements in namelist[] needs to be
dynamic, and they will each be storing a dirent structure.
Have I missed something?
Well, you're not allocating any space for dirent pointer(s) in namelist.
Anyway, what I was referring to is that after you've allocated memory for
dirent and assigned the pointer to namelist[n] you then go ahead and set
namelist[n] to point to the address of the (unallocated and uninitialized?)
*variable* tempEnt, discarding the pointer to the allocated buffer and
effectively leaking memory.

The tempEnt assignment has a type mismatch.
With the mismatch, treating namelist[n] as a pointer to dirent struct and
writing to its members thrashes the stack variables, probably at least
namelist and n.

To copy values of tempEnt to namelist[n], you should copy them verbatim
or maybe even use memcpy() (which is likely to be slower).
Remember, this is C and these are pointers, not objects.

You seem to be using a table there, but maybe a dynamic structure
would be more appropriate. Of course, a table of pointers allocated
with malloc() can be resized with realloc() and pointers can be copied
freely, but there are alternatives.
For C++ there's STL (std::vector etc), but there are existing
implementations in C as well.
For example, FreeBSD has macros for linked lists (3 clause BSD license)
http://nixdoc.net/man-pages/FreeBSD/STAILQ_ENTRY.3.html
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/sys/queue.h
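
For a rough idea of how the BSD queue macros read in practice, here is a short sketch of a singly-linked tail queue of names. It assumes a <sys/queue.h> that provides the STAILQ_* macros (glibc and the BSDs ship one; with MinGW the header would probably have to be copied into the project). The node type and function names are invented for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* One list node holding a heap-allocated copy of an entry name. */
struct name_node {
    char *name;
    STAILQ_ENTRY(name_node) link;
};

/* Declares `struct name_queue`, the head type of the list. */
STAILQ_HEAD(name_queue, name_node);

/* Appends a copy of `name` to the tail; returns 0 on success, -1 on failure. */
int name_queue_push(struct name_queue *q, const char *name)
{
    struct name_node *node = malloc(sizeof *node);

    if (node == NULL)
        return -1;
    node->name = malloc(strlen(name) + 1);
    if (node->name == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->name, name);
    STAILQ_INSERT_TAIL(q, node, link);
    return 0;
}

/* Walks the list and returns the number of nodes. */
size_t name_queue_length(struct name_queue *q)
{
    struct name_node *node;
    size_t n = 0;

    STAILQ_FOREACH(node, q, link)
        n++;
    return n;
}

/* Removes and frees every node, leaving the queue empty. */
void name_queue_clear(struct name_queue *q)
{
    struct name_node *node;

    while ((node = STAILQ_FIRST(q)) != NULL) {
        STAILQ_REMOVE_HEAD(q, link);
        free(node->name);
        free(node);
    }
}
```

The head must be initialised with STAILQ_INIT(&q) before the first push. Appending is constant time, so no count of entries is needed in advance.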
--
Tuomo

... Profanity is the language all programmers know best
Boyd, Todd M.
2008-06-05 21:18:40 UTC
-----Original Message-----
Sent: Thursday, June 05, 2008 3:53 PM
To: MinGW Users List
Subject: Re: [Mingw-users] Linux dirent->d_type ... Windows
equivalent?

---8<--- snip
Well, you're not allocating any space for dirent pointer(s) in
namelist.
Anyway, what I was referring to is that after you've allocated memory for
dirent and assigned the pointer to namelist[n] you then go ahead and set
namelist[n] to point to the address of the (unallocated and
uninitialized?)
*variable* tempEnt, discarding the pointer to the allocated buffer and
effectively leaking memory.
The tempEnt assignment has a type mismatch.
With the mismatch, treating namelist[n] as a pointer to dirent struct and
writing to its members thrashes the stack variables, probably at least
namelist and n.
To copy values of tempEnt to namelist[n], you should copy them
verbatim
or maybe even use memcpy() (which is likely to be slower).
Remember, this is C and these are pointers, not objects.
I have corrected the code before this message in order to properly
assign the array pointers. I can get the members of the dirent struct
perfectly fine now, but I've run into a different problem.
You seem to be using a table there, but maybe a dynamic structure
would be more appropriate. Of course, a table of pointers allocated
with malloc() can be resized with realloc() and pointers can be copied
freely, but there are alternatives.
Agreed, I should be using a dynamic structure. However, the first
version of the program will most likely just traverse the directory tree
twice--once to count the entries, and the second time to actually grab
their attributes.

At the risk of veering far off-topic (as far as MinGW is concerned),
I'll voice my error if anyone wants to personally reply to my message. I
will try to explain my error with the following code and description:

[code]
struct dirent ** namelist;
int n = 0;

... more code here that counts dir entries into variable "n" using
readdir() ...

namelist = malloc(sizeof(struct dirent) * n);

... code here to reopen directory and start loading file stats ...

int a = 0;
for( ; a < n; a++)
{
    namelist[a] = readdir(dirToScan);
}

qsort(namelist, n, sizeof(struct dirent), alphaDirSort);
[/code]

So, that's basically the layout of my program. (At least, it's the
layout of the part that reads the directory tree into an array.) After
some debugging, I've found that a segmentation fault occurs when qsort()
is called. My alphaDirSort() function is defined as:

[code]
int alphaDirSort(const void * a, const void * b)
{
    puts("Inside alphaDirSort");

    ... code code code ...
}
[/code]

...but the "Inside alphaDirSort" is never being displayed. Something
about one of the parameters I've passed to qsort() is causing qsort() to
segfault. I know that namelist[] is being filled with valid dirent
structs, since I've tested it with printf("%s\n", namelist[n]->d_name)
and it worked fine. I also know that the prototype for my alphaDirSort()
function is in line with what qsort() expects to pass array values to.
Am I not passing the proper element size to qsort()? Should I
de-reference namelist?

Again, I'm sorry for getting off-topic. If you feel that posting replies
to my query would further deviate from MinGW discussion, please reply to
me privately (tmboyd1 at ccis dot edu).

Thanks to everyone for all their help!


Todd Boyd
Web Programmer
Tuomo Latto
2008-06-05 21:48:45 UTC
Post by Boyd, Todd M.
[code]
struct dirent ** namelist;
[...]
Post by Boyd, Todd M.
qsort(namelist, n, sizeof(struct dirent), alphaDirSort);
^^^^^^^^^^^^^^^^^^^^^

It's an array of pointers, not an array of structs.

Also, your debug prints might have buffering issues.
I prefer fprintf() with stderr and strings ending with a newline.
Using fflush() won't hurt either, except performance-wise.
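
To make the point about pointers versus structs concrete, here is a minimal sketch of sorting an array of struct dirent pointers by name. qsort() passes the comparator pointers to the array elements, and since each element is itself a pointer, the comparator needs one extra dereference; reading d_name straight from the argument is one way to end up with garbage. The wrapper sort_namelist() is a hypothetical helper added for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <dirent.h>

/* qsort() hands the comparator pointers to the array ELEMENTS. The elements
   here are `struct dirent *`, so each argument is really a
   `struct dirent * const *` and must be dereferenced once to reach the entry. */
int alphaDirSort(const void *a, const void *b)
{
    const struct dirent *ea = *(const struct dirent * const *)a;
    const struct dirent *eb = *(const struct dirent * const *)b;

    return strcmp(ea->d_name, eb->d_name);
}

/* Sorts an array of `n` dirent pointers by name (hypothetical helper). */
void sort_namelist(struct dirent **namelist, size_t n)
{
    /* The element size is that of one pointer, not of one struct. */
    qsort(namelist, n, sizeof *namelist, alphaDirSort);
}
```

The array itself only needs room for n pointers, that is, n * sizeof(struct dirent *) bytes.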
--
Tuomo

... And now for something completely different...
Boyd, Todd M.
2008-06-05 22:08:24 UTC
-----Original Message-----
Sent: Thursday, June 05, 2008 4:49 PM
To: MinGW Users List
Subject: Re: [Mingw-users] Linux dirent->d_type ... Windows
equivalent?
[OT]
Post by Boyd, Todd M.
[code]
struct dirent ** namelist;
[...]
Post by Boyd, Todd M.
qsort(namelist, n, sizeof(struct dirent), alphaDirSort);
^^^^^^^^^^^^^^^^^^^^^
It's an array of pointers, not an array of structs.
Also, your debug prints might have buffering issues.
I prefer fprintf() with stderr and strings ending with a newline.
Using fflush() won't hurt either, except performance-wise.
You're right about the buffering issues. I've been using fprintf(stderr,
x, x) now, and the "Inside alphaDirSort" was displayed. Using this
debugging mechanic, I've discovered that it's how I'm dereferencing the
dirent pointers in alphaDirSort. I've changed it from:

alphaDirSort(const void *, const void *);

to:

alphaDirSort(const struct dirent *, const struct dirent *);

In order to try and ease some of my confusion, but it didn't really do
anything for me except spare me from casting them later. I have tried
several ways trying to access the d_name member. All of them return
garbage rather than the d_name (or are invalid operations).

Given struct dirent * a:

puts(a->d_name); // garbage
puts(a.d_name); // err: not a structure or union
puts((*a).d_name); // garbage
puts((*a)->d_name); // err: invalid type argument of ->
puts((&a).d_name); // err: not a structure or union
puts((&a)->d_name); // err: not a structure or union

Changing the namelist malloc line to reflect pointers to dirents instead
of dirents caused an error and munmap_chunk() to be run. alphaDirSort()
returned the same garbage, regardless of whether qsort() was told to handle
pointers or structs. I think I need to do a bit more research on
qsort(). Since I am passing it a valid array that can be accessed in a
"normal" way (using [] index specifiers), there should be some code out
there that will explain to me the proper method for accessing the values
passed through qsort() to whatever comparison function it will use.

I'll do research on my own, as I've obviously missed some critical
information about qsort(), and I'm out of practice dealing with
pointers.

Thanks anyway (and sorry for clogging the listserv),


Todd Boyd
Web Programmer
Tuomo Latto
2008-06-05 22:49:30 UTC
Post by Boyd, Todd M.
I'll do research on my own, as I've obviously missed some critical
information about qsort(), and I'm out of practice dealing with
pointers.
That qsort should be ok. More specifically these should work:
struct dirent **namelist;
qsort(namelist, n, sizeof(void*), alphaDirSort);
alphaDirSort(const void *, const void *);
puts(a->d_name);
puts((*a).d_name);

I'd say there's a problem with copying the data and/or memory allocation.
I didn't realize earlier that there's actually a C string (with statically
allocated buffer) inside the struct, so when copying struct contents,
memcpy() should be the best option.

And yes, you should definitely brush up on pointers.
Remember that in C nothing gets dynamically allocated automatically.


I should probably shut up now.
--
Tuomo

... As I let go of my feelings of guilt,
I am in touch with my inner sociopath
-- Ways for Personal Growth
http://www.ericbair.com/humor/PerGrowth.txt