
I don't think there's an O(n^2) algorithm in there. I just created a directory with 100,000 entries. Listing it (from Cygwin, no less, using 'time ls | wc') takes 185 milliseconds. The directory is on a plain-jane 7.2k 1TB drive, though of course it's hot in cache from having been created. 'dir > nul', mind you, is quite a bit slower, at over a second.


You only did one test, so you have no idea what the complexity curve is. Do at least three tests, with 1000, 10,000 and 100,000 entries and graph the results. Three tests is still pretty skimpy to figure out what the curve is, so do tests at 10 different sizes.
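A minimal sketch of that multi-size test, in Python rather than Cygwin `ls`. The names `time_listing` and `loglog_slope` are made up for illustration, and `os.listdir` is yet another code path (unlike `ls`, it doesn't sort), so treat the numbers as a rough indication only. It warms the cache with one untimed pass, times the listing at several sizes, and fits a slope on a log-log scale: a slope near 1 suggests roughly linear growth, near 2 suggests quadratic.

```python
import math
import os
import tempfile
import time


def time_listing(n, repeats=3):
    """Create n empty files in a fresh temp directory and return the best
    wall-clock time (seconds) for listing it, after one warm-up pass."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(n):
            open(os.path.join(d, f"f{i:07d}"), "w").close()
        os.listdir(d)  # untimed warm-up so later passes hit a hot cache
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            entries = os.listdir(d)
            best = min(best, time.perf_counter() - start)
        assert len(entries) == n
        return best


def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Ten sizes from 1,000 to 100,000; adjust to taste.
    sizes = [1000, 2000, 3000, 5000, 7500, 10000, 20000, 30000, 50000, 100000]
    times = [time_listing(n) for n in sizes]
    for n, t in zip(sizes, times):
        print(f"{n:>7} entries: {t * 1000:8.2f} ms")
    print(f"estimated exponent: {loglog_slope(sizes, times):.2f}")
```

Creating the files dominates the run time, so the larger sizes can take a while; the timed part is only the listing itself.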

Also, Joel's complaint was about the Windows Explorer GUI (specifically, opening a large recycle bin takes hours). Cygwin `ls` is using a completely different code path. Your experiment does suggest that Joel's problem is in the GUI code, though, and not the NTFS filesystem code.


Oh, the OS treeview is dreadful, everyone who's seriously coded on Windows knows that.

As to the actual complexity curve (which, knowing what I do about NTFS, I'm fairly sure is O(n log n)), I don't really care about it; since it hasn't shown up in a serious way at n=100,000, it's unlikely to realistically affect anyone badly. Even if 1 million files (in a single directory!) took 18.5 seconds, it wouldn't be pathological. Other limits like disk bandwidth and FS cache size seem like they'd kick in sooner.
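For reference, the 18.5-second figure is what you get by scaling the 185 ms measurement quadratically from 100,000 to 1 million entries. A quick sketch of that arithmetic under three assumed growth models (the `extrapolate` helper is hypothetical, and a single data point can't tell you which model is right):

```python
import math


def extrapolate(t0, n0, n1, growth):
    """Scale a time t0 measured at size n0 to size n1 under a growth model."""
    return t0 * growth(n1) / growth(n0)


models = {
    "O(n)": lambda n: n,
    "O(n log n)": lambda n: n * math.log(n),
    "O(n^2)": lambda n: n * n,
}

# 0.185 s at 100,000 entries is the measurement quoted in the thread.
for name, growth in models.items():
    seconds = extrapolate(0.185, 100_000, 1_000_000, growth)
    print(f"{name}: {seconds:.2f} s")
```

That works out to about 1.85 s for linear, 2.22 s for n log n, and 18.5 s for quadratic.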


For all you know you're seeing the HDD cache and not any kind of filesystem caching. evmar mentions SSD making a difference for workloads that should fit in RAM, which means HDD caching also would, for a very modest workload.


I know it's not the HDD cache. I can monitor physical disk requests vs logical filesystem requests (diskmon vs procmon). But I repeated it anyway on my SSD, and the results are the same.


Which version of Windows? In my experience certain things will freeze up Explorer on XP/Server 2k3 effectively indefinitely, while on Vista/7/Server 2k8 there is a progress meter and everything is stable.


Windows 7 x64. But I am talking about the command line, not the UI. The UI controls (whether in Explorer or in other apps) generally don't react well to non-human-sized input.


I think the recursive search is the thing that is broken.


I think the fact that it is hot in cache may have influenced things here.


The OP specifically mentions doing it twice to make sure the cache is hot: "Do it twice and time how long it takes. The reason to time only the second run is to give the OS a chance to cache data in ram"



