Q. I’m implementing a software on top of ext4 file system, which will have thousands of directories and inside each of the directory there will be thousands of regular file. During the lifetime of the software the files inside the directories keep growing. As a result the blocks of directories get spread across the entire filesystem. This leads to poor performance for the directory traversal.
To fix this issue I wanted to have all the blocks of any of the directory allocated contiguously on the file system. For this I wanted a mechanism to preallocate directories on ext4 file system.
ext4 file system supports preallocation of regular files using system call fallocate. But there is no support for preallocation of blocks for a directory. Why this is so? I guess no one thought it is a required feature.
Anyway this can be implemented by an unrelated characteristic of ext4 filesystem. That is, ext4 never shrinks the directory size. That means once a block is allocated to a directory, ext4 never reclaims it back from the directory. We can exploit this feature to preallocate the blocks for a directory in the ext4 file system.
For example I’ve a directory foo, I’m anticipating that over the time this directory will contain 60000 files of average filename upto 48 characters. To preallocate data blocks for foo directory to hold these futuristic data entries, take following steps:
1. Create foo directory
2. Inside foo directory, create zero byte 60000 files of 48 character file names.
3. Delete all these 60000, zero byte files.
4. At the end of this you create foo directory, which has preallocated blocks for its futuristic need. Delayed allocation of ext4 makes it sure that these blocks are (mostly) contiguous.
cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 48 | head -n 60000 > /tmp/files.lst
cat /tmp/files.lst | xargs touch
rm -rf *