Dialog Box Options
Fuzzy matching in changed diff blocks
Fuzzy matching enables ExamDiff Pro to intelligently align lines in changed blocks that are similar but not identical. The options in this section control when fuzzy matching is activated and how selective it is.
- Do not perform fuzzy matching
Selecting this option always disables fuzzy matching.
- Always perform fuzzy matching
Selecting this option always enables fuzzy matching. This option is not recommended if you are comparing huge files, as it can slow down comparison considerably.
- Perform fuzzy matching only if both files are smaller than
Selecting this option enables fuzzy matching only if the files compared are both smaller than the specified size (250KB by default).
- Matching method
Use this option to control how the matching of similar lines is performed. The default is Characters, and it is suitable for most uses. Choose Words if your files have many distinguishable words.
- How similar must lines be to allow fuzzy matching?
This option sets the minimum percentage similarity that two lines must have in order to be aligned when fuzzy matching is turned on. Keep in mind that if a line has multiple potential matches within a diff block, ExamDiff Pro selects the combination of matches that maximizes the total similarity of all the fuzzy matches in the block. Our users typically find a threshold between 40% and 80% to be effective in producing the clearest alignments for most use cases.
Selecting 100% similarity ("No fuzzy matching") effectively disables fuzzy matching, because only exactly identical lines could be fuzzy-matched. Since changed blocks have no pairs of identical lines, no fuzzy matching will occur.
Selecting 0% similarity ("Match any lines") will align the lines in each block so as to maximize the total similarity of all the fuzzy matches in the block, even if this configuration results in some lines being matched with completely dissimilar lines.
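The interaction between the matching method and the similarity threshold can be illustrated with a minimal Python sketch. It is not ExamDiff Pro's implementation: the function names `similarity` and `fuzzy_align` are hypothetical, the similarity score comes from Python's `difflib.SequenceMatcher` as a stand-in, and the order-preserving dynamic program is only one assumed way to pick the set of matches with the highest total similarity.

```python
from difflib import SequenceMatcher


def similarity(a, b, method="characters"):
    """Percentage similarity of two lines (stand-in metric, not ExamDiff Pro's)."""
    if method == "words":
        # Compare sequences of whitespace-separated words instead of characters.
        a, b = a.split(), b.split()
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def fuzzy_align(left, right, threshold=60.0, method="characters"):
    """Pair lines of one changed block, keeping line order, so that the
    total similarity of all pairs is as large as possible. Only pairs whose
    similarity reaches the threshold may be matched."""
    n, m = len(left), len(right)
    sim = [[similarity(l, r, method) for r in right] for l in left]
    # best[i][j] = best total similarity for left[i:] against right[j:]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best[i][j] = max(best[i + 1][j], best[i][j + 1])
            if sim[i][j] >= threshold:
                best[i][j] = max(best[i][j], sim[i][j] + best[i + 1][j + 1])
    # Walk the table from the top-left corner to recover the chosen pairs.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if sim[i][j] >= threshold and best[i][j] == sim[i][j] + best[i + 1][j + 1]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif best[i][j] == best[i + 1][j]:
            i += 1
        else:
            j += 1
    return pairs


old_block = ["int total = 0;", "return total;"]
new_block = ["// new comment", "int total = 1;", "return total + 1;"]
print(fuzzy_align(old_block, new_block, threshold=60.0))   # [(0, 1), (1, 2)]
print(fuzzy_align(old_block, new_block, threshold=100.0))  # []
```

In this sketch a threshold of 100 yields no pairs for a changed block, since no two lines are identical, while a threshold of 0 lets every pair compete and leaves the choice entirely to the total-similarity maximization.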
- Automatically detect text/binary files
Allow ExamDiff Pro to determine whether the compared files are text or binary.
- Always treat these files as text
Files matching any name filter in this comma-separated set will always be treated as text files.
- Always treat these files as binary
Files matching any name filter in this comma-separated set will always be treated as binary files, unless they also match one of the filters in the Always treat these files as text set (text takes precedence over binary).
- Treat all files as text
Compare all files as text and show the results in text format.
- Treat all files as binary
Compare all files as binary and show the results in HEX format.
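The precedence rule for the two name-filter sets (text wins over binary, and anything unmatched falls through to automatic detection) can be sketched in Python as follows. The function names and the use of `fnmatch` wildcards are assumptions made for illustration; the exact filter syntax ExamDiff Pro accepts is not described here.

```python
from fnmatch import fnmatch


def matches_any(name, filters):
    """True if name matches at least one pattern in a comma-separated set."""
    patterns = [p.strip() for p in filters.split(",") if p.strip()]
    return any(fnmatch(name, pattern) for pattern in patterns)


def classify(name, text_filters="", binary_filters=""):
    """Return 'text', 'binary', or 'auto' (leave it to automatic detection)."""
    if matches_any(name, text_filters):
        return "text"    # text takes precedence over binary
    if matches_any(name, binary_filters):
        return "binary"
    return "auto"


print(classify("build.log", "*.txt, *.log", "*.log, *.exe"))
print(classify("setup.exe", "*.txt, *.log", "*.log, *.exe"))
print(classify("main.c", "*.txt, *.log", "*.log, *.exe"))
```

A file such as `build.log` that appears in both sets is treated as text, which mirrors the rule stated above.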
This option is intended for advanced users only. You can use it to choose between six different diff algorithms.
The Classic diff algorithm is the algorithm used in ExamDiff Pro prior to version 12.0. It's a variation of the Myers algorithm.
The next four algorithms (Myers, Minimal, Patience, and Histogram) are implemented by the LibXDiff open-source library.
In general, we have found that ExamDiff Pro's Classic algorithm (which is itself a heavily modified version of Myers) gives the best results in most situations, but it's possible that one of the alternative diff algorithms could give you better results in some cases. This paper by Nugroho, Hata, and Matsumoto gives a good overview of the Myers, Minimal, Patience, and Histogram algorithms, and this blog post by Lup Peng offers some examples of when each of these four algorithms can be helpful.
Finally, in line-by-line comparison every line is matched to the line at the same position in the opposite file (for example, line 50 in the first file is matched with line 50 in the second file, regardless of their contents). One exception is if manual synchronization links are used, in which case line-by-line comparison will be broken up by the links.
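As a rough illustration of line-by-line comparison, the following Python sketch pairs lines purely by position. The function name `line_by_line` is hypothetical, and manual synchronization links are not modeled.

```python
from itertools import zip_longest


def line_by_line(left, right):
    """Pair line k of one file with line k of the other, regardless of content.
    A missing line (when one file is shorter) is represented by None."""
    return [(l, r, l == r) for l, r in zip_longest(left, right)]


for left_line, right_line, same in line_by_line(["a", "b"], ["a", "c", "d"]):
    print(left_line, right_line, "same" if same else "different")
```

No attempt is made to realign the files after an inserted or deleted line, so a single insertion shifts every later line out of step.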
Optimize diff block alignment
With this option turned on, ExamDiff Pro uses a heuristic for determining the boundaries of diff blocks, based on the open-source work done in diff-slider-tools.
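The idea behind such a heuristic is that an inserted block can often be shifted up or down without changing the resulting file, and the tool has to pick one placement. The Python sketch below enumerates the equivalent placements and picks one with a toy scoring rule. Both function names and the scoring rule are assumptions for illustration only; they are not the heuristic used by ExamDiff Pro or diff-slider-tools.

```python
def slide_positions(lines, start, end):
    """All (start, end) placements of the inserted block lines[start:end]
    that describe the same change to the file. Requires start < end."""
    positions = [(start, end)]
    s, e = start, end
    # Slide up while the line just above the block equals the block's last line.
    while s > 0 and lines[s - 1] == lines[e - 1]:
        s, e = s - 1, e - 1
        positions.insert(0, (s, e))
    s, e = start, end
    # Slide down while the block's first line equals the line just below it.
    while e < len(lines) and lines[s] == lines[e]:
        s, e = s + 1, e + 1
        positions.append((s, e))
    return positions


def pick_position(lines, positions):
    """Toy rule (an assumption): prefer a block that starts on a non-blank
    line with the least indentation."""
    def score(pos):
        first = lines[pos[0]]
        return (first.strip() == "", len(first) - len(first.lstrip()))
    return min(positions, key=score)


new_file = [
    "def foo():", "    return 1", "",
    "def bar():", "    return 2", "",
    "def baz():", "    return 3",
]
candidates = slide_positions(new_file, 2, 5)  # block reported as lines 2..4
print(candidates)                             # [(2, 5), (3, 6)]
print(pick_position(new_file, candidates))    # (3, 6)
```

Here the block reported as a blank line followed by `def bar()` is shifted so that it starts at `def bar()` and ends with the blank line, which is usually easier to read.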