HDFS Block Concepts
Filesystem Blocks

A filesystem controls how data is stored and retrieved. Without a filesystem, information placed on a storage medium would be one large body of data with no way to tell where one piece of information stops and the next begins.

A block is the smallest unit of data that can be written to or read from the disk, and filesystems store data in units of blocks. Filesystem blocks are typically a few kilobytes in size. A file whose contents are smaller than the block size still occupies a full block on disk; with a 4 KB block size, for example, a 1 KB file still consumes 4 KB. Blocks are transparent to the user performing filesystem operations such as read and write.

Need for Distributed Filesystems

When a dataset outgrows the storage capacity of a single physical machine, it becomes necessary to partition it across a number of separate machines. Filesystems that manage storage across a network of machines are called distributed filesystems. Since they are network based