
PeaZip is a free, Open Source file archiver utility, an alternative to WinRar / WinZip that provides a unified, cross-platform, portable GUI for tools such as 7-Zip, FreeArc, PAQ, and UPX.
Create 7Z, ARC, BZ2, GZ, PAQ, PEA, self-extracting archives, TAR, WIM, XZ, ZIP files.
Open and extract 150+ archive types: ACE, CAB, DMG, ISO, RAR, UDF, ZIPX files...
Features include: extract, create and convert multiple archives at once, split / join files, strong encryption with two-factor authentication, encrypted password manager, secure delete, find duplicate files, calculate hash values, and export job definitions as scripts to automate backup / restore.


Find duplicate files, hash, checksum utility


How to detect identical files, calculate checksum and hash, and compare files

Check for identical or similar data







PeaZip can be used as a duplicate file finder: it searches for binary identical files so that data can be de-duplicated, either to save disk space or to improve compression by eliminating redundancy in the input. Several detection methods are available:

Identify duplicates in the file manager

When browsing a filesystem (not inside a compressed archive), the file browser can show each file's checksum / hash value on demand in the last column, which makes it possible to identify binary identical files, since they share the same checksum/hash value.
Clicking the name of the function (in the context menu, "File tools" group) displays the hash or checksum value for all (or selected) files.
Clicking "Find duplicates" displays the size and hash or checksum value only for duplicate files, that is, the same binary content found in two or more distinct files, and reports the number of non-unique files identified.
In both cases, sorting by the CRC column groups all files (in the same folder, or matching the same search filter) that have an identical hash or checksum.
The verification function can be set from the main application menu (Organize, Browser, Checksum/hash). A wide selection of algorithms is available, ranging from simple checksum functions such as Adler32 and the CRC family (CRC16, CRC24, CRC32, and CRC64), to hash functions like eDonkey/eMule, MD4, MD5, and SHA-1, to cryptographically strong hashes such as Ripemd160, SHA-2 (SHA224, SHA256, SHA384, SHA512), and Whirlpool512.
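
The general idea behind hash-based duplicate detection can be sketched in a few lines of Python. This is only an illustration of the technique, not PeaZip's actual implementation: the function names `file_digest` and `find_duplicates` are hypothetical, and pre-grouping files by size before hashing is an assumed optimization.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, algorithm="sha256"):
    """Map (size, digest) to the list of paths sharing it (two or more files)."""
    # Step 1: group regular files by size (assumed optimization, cheap to compute).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files whose size is shared with another file.
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        for path in paths:
            groups[(size, file_digest(path, algorithm))].append(path)

    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates("some/folder")` returns one entry per group of identical files; files that share a size but not a digest are dropped from the result.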


When browsing an archive, this on-demand verification is not available. However, if the archive format supports it, the CRC column displays the data integrity information stored in the archive (e.g. CRC32 in ZIP archives), so the archive content can be sorted by the CRC column to group identical files and find duplicates.
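
The same grouping can be reproduced outside the GUI for ZIP archives, because the format stores a CRC32 for every entry. The sketch below uses Python's standard `zipfile` module; the function name `zip_duplicate_candidates` is hypothetical, and pairing the CRC with the uncompressed size is an assumption made here to reduce false matches. Since CRC32 is a short checksum, the groups should be read as duplicate candidates rather than proof of identity.

```python
import zipfile
from collections import defaultdict


def zip_duplicate_candidates(archive_path):
    """Group ZIP entries by (stored CRC32, uncompressed size); keep groups of 2+."""
    groups = defaultdict(list)
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            # info.CRC is read from the archive metadata; nothing is extracted.
            groups[(info.CRC, info.file_size)].append(info.filename)
    return {key: names for key, names in groups.items() if len(names) > 1}
```

No data is decompressed: only the archive's table of contents is read, which is why this check is fast even on large archives.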


Identify similar images in the file manager

When browsing a filesystem, PeaZip can display thumbnails of graphic files: in the context menu, under Organize, check "Show picture thumbnails", or select one of the file browser's preset styles that shows thumbnails.
While checksum/hash based inspection finds exactly identical files (and images), thumbnails let the user spot similar images (e.g. the same picture or graphic saved in different formats, with different color depth or compression settings, or scaled to different sizes), to help decide whether the (pseudo)duplication is acceptable, and which copy to keep.


Calculate multiple checksum and hash functions at once

The Check files utility in the "File tools" submenu (context menu) computes multiple hash and checksum algorithms (the same ones available in the file manager) on multiple files at once, e.g. to compare a group of files and identify redundant ones, or to check files for corruption when an original checksum or hash value is known.

Using multiple functions, and especially relying on cryptographically strong hash functions such as Ripemd, SHA-2 or Whirlpool, can defeat attempts to forge identical-looking files: for simpler checksum and hash functions it is computationally feasible to find a collision (different inputs mapped to the same output), while for strong hash functions it is not.
This way, even a purposely crafted modification of a file would not go unnoticed by the most sophisticated detection algorithms.
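
As an illustration of computing several functions at once, the following Python sketch feeds each chunk of a file to every selected algorithm, so the file is read only once. The function name `multi_digest` and the default algorithm list are assumptions chosen for the example, not PeaZip's internals.

```python
import hashlib
import zlib


def multi_digest(path, algorithms=("md5", "sha1", "sha256", "sha512"),
                 chunk_size=1 << 20):
    """Compute CRC32 plus several hashlib digests in a single pass over the file."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)   # running checksum
            for h in hashers.values():
                h.update(chunk)            # running hashes
    result = {name: h.hexdigest() for name, h in hashers.items()}
    result["crc32"] = f"{crc:08x}"
    return result
```

Two files can then be treated as identical only if every value in the two result dictionaries matches; a forged file would have to collide on all of the functions simultaneously.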

The output value of hashes and checksums can be displayed as hexadecimal (HEX, either LSB or MSB) or encoded as Base64.
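
To show how one value can be rendered in these forms, here is a small Python sketch using CRC32. It assumes that "MSB" and "LSB" refer to the byte order in which the value is written (most or least significant byte first); the function name `render_crc32` is hypothetical, and PeaZip's exact rendering may differ.

```python
import base64
import zlib


def render_crc32(data: bytes) -> dict:
    """Render the CRC32 of data as MSB-first hex, LSB-first hex, and Base64."""
    crc = zlib.crc32(data)
    msb = crc.to_bytes(4, "big")      # most significant byte first
    lsb = crc.to_bytes(4, "little")   # least significant byte first
    return {
        "hex_msb": msb.hex(),
        "hex_lsb": lsb.hex(),
        # Base64 of the MSB-first bytes (an assumption for this example).
        "base64": base64.b64encode(msb).decode("ascii"),
    }
```

All three strings carry the same 32 bits; when comparing against a published value, it matters only that both sides use the same representation.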


Alternative: byte-to-byte comparison

The Compare files utility in the "File tools" submenu performs a byte-to-byte comparison between two files. Unlike the checksum/hash method, it is not subject to collisions under any circumstance, and it can report which bytes differ, so it tells not only whether two files are identical but also what changed between the two versions.
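
A byte-to-byte comparison of this kind can be sketched in Python as follows. The function name `compare_files`, the `max_report` limit, and the output format are assumptions for illustration; the sketch also assumes regular local files, for which each read returns a full chunk until the end of the file.

```python
from itertools import zip_longest


def compare_files(path_a, path_b, max_report=10, chunk_size=1 << 16):
    """Return (identical, diffs); diffs holds (offset, byte_a, byte_b) tuples.

    A byte value of None means that file ended before the other one.
    """
    diffs = []
    identical = True
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                identical = False
                for i, (x, y) in enumerate(zip_longest(a, b)):
                    if x != y and len(diffs) < max_report:
                        diffs.append((offset + i, x, y))
            offset += max(len(a), len(b))
    return identical, diffs
```

Because every byte is compared directly, the result does not depend on any hash function; the cost is that both files must be read in full whenever they are identical.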

Topics: find duplicate files, deduplicate files, compare files, detect redundant data, identical files, checksum, hash, duplicate finder tool, detect duplicates, similar files, check differences, file hashing, checksum utility.

Related articles: How to improve file compression performances, Verify file hash and checksum, How to encrypt archives, Extract encrypted archives, Secure delete, Split and join files, Share files









© PeaZip srl, TOS and Privacy
Giorgio Tani