Abstractions and the limits of their applicability
A lot has been written about this, but apparently not enough. :)
I am talking about using high-level APIs without any idea of how they work inside, and why it is important
to think about the application area of your code before writing it.
With real examples.
How to legally damage an HDD with user software
Today I recovered data from yet another HDD with a single bad block. It appeared in an interesting place: the location
of the Firefox session save file metadata. This is interesting because of the massive number of read/write attempts at the same place.
Since NTFS updates the last access time by default, write operations are performed even more frequently than the software's author may imagine.
I have my own statistics about the kinds of data located on damaged sectors.
These are frequently updated config files, session save files, and database files with a high UPDATE rate. It takes about a year or two
of regular writes to completely damage a sector.
The simple algorithm: create the file, then overwrite it in place:

    sh make_config.sh > /etc/my.conf

The proper way is more complicated. There are several solutions:
- The disk driver or a special utility monitors the surface state and informs the file system (FS) driver, or
initiates relocation of file data to another place on the disk.
I don't know of any implementation of this.
- The file system driver or a special utility may monitor the update rate and initiate relocation of frequently updated data.
There are some file systems with such functionality: the latest revisions of UDF declare mechanisms for surface usage management,
and ZFS uses a Copy-On-Write (CoW) algorithm, which leads to even wear of the disk.
- A smart algorithm in the software:
  - Create a new file and work with it; then rename the old one, rename the new one to the original name, and remove the old one.
  - Notify the FS about the type of file being created, e.g. use the TEMPORARY flag, a special API, or even a RAM disk;
    this may reduce physical disk access.
  - Minimize disk access: use the system file cache or implement your own, depending on your application.
  - Monitor the number of writes and relocate the data yourself.
- Defragmentation partially solves this problem, but only as a side effect.
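The create-then-rename step listed above can be sketched in a few lines of Python (a minimal sketch; the file name and config content are invented for illustration):

```python
import os
import tempfile

def atomic_overwrite(path, data):
    """Write data to a new temporary file, then atomically replace
    the original, so the same sectors are not rewritten in place and
    a crash never leaves a half-written file behind."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # Create the new file in the same directory so the final rename
    # stays within one file system (a cross-FS rename becomes a copy).
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data hits the disk
        os.replace(tmp_path, path)  # atomic on both POSIX and Windows
    except BaseException:
        os.remove(tmp_path)
        raise

atomic_overwrite("my.conf", "timeout = 30\n")
```

Because each write lands on a freshly allocated file, the FS allocator naturally spreads the data across the disk instead of hammering one sector.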
How to legally slow down or even freeze a powerful PC with user software
Editors, viewers, data recovery and maintenance tools, and mailers often operate slowly, freeze, or crash when trying
to work with huge files. The most common reason is
memory mapping (mapping a file into memory). In general, this is a perfect and convenient mechanism. It is used
by many editors, and the OS file caching subsystem uses it as well. Part of or the entire file is mapped into the address space of the
process, and then the regular paging mechanism is used: memory blocks being accessed are automatically read into physical memory,
while unused ones are written back to disk and their memory is freed.
And now let's talk about the internals. In order to map a file into memory, the OS has to build and maintain a table of relations
between memory addresses and positions in the file, modification marks, and so on. There is no problem with small and medium
files, but for hundreds of megabytes the mapping consumes a significant amount of memory. Also, the OS has to allocate a contiguous
block of memory addresses. There is no problem on 64-bit OSes, but on a 32-bit system we have only 4 GB in total, and only
about 1 or 2 GB of that are available for file mapping. It turns out that we run into unexpected conditions when trying to open a big file
in the regular way.
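To make the mechanism concrete, here is a minimal Python sketch of mapping a file and letting the paging machinery do the I/O (the file name and size are invented for the example):

```python
import mmap

path = "big.dat"
# Create a sample file; a real editor would open an existing one.
with open(path, "wb") as f:
    f.write(b"x" * (1024 * 1024))

with open(path, "r+b") as f:
    # The OS reserves a contiguous range of addresses for the whole
    # mapping here; on a 32-bit system a multi-GB file would make
    # this call fail even though the file itself is perfectly valid.
    mm = mmap.mmap(f.fileno(), 0)  # length 0 maps the entire file
    first = mm[0:4]     # touching pages faults them into physical RAM
    mm[0:4] = b"abcd"   # dirtied pages are written back lazily
    mm.flush()          # force write-back, like msync()
    mm.close()
```

The convenience is exactly the trap: slicing `mm` looks like a cheap in-memory operation, while underneath every access can trigger disk I/O and page-table bookkeeping.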
So, if you are about to work with big files, do not assume that using 64-bit file pointers will be enough.
You will also need a special algorithm for data access and caching, and you must take OS specifics into account.
Or check the file size up front and inform the user that the file is too big to be processed. That would be much better than
unpredictable behavior of the application.