Project: MD5 Message Digest Unit
Unit: PJMD5
Class: TPJMD5
Applies to: ~>1.0
procedure Process(const X: TBytes; const StartIdx, Count: Cardinal);
overload;
procedure Process(const X: TBytes; const Count: Cardinal); overload;
procedure Process(const X: TBytes); overload;
procedure Process(const Buf; const Count: Cardinal); overload;
procedure Process(const S: RawByteString); overload;
procedure Process(const S: ShortString); overload;
procedure Process(const S: WideString); overload;
procedure Process(const S: UnicodeString; const Encoding: TEncoding); overload;
procedure Process(const S: UnicodeString); overload;
procedure Process(const Stream: TStream; const Count: Int64); overload;
procedure Process(const Stream: TStream); overload;
There are several overloaded versions of the Process method. Each one adds the data passed in its parameters to the current MD5 hash.
The advantages of these methods over the similar Calculate methods are:
The disadvantage of Process is that an instance of TPJMD5 must be created before the method can be used.
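The trade-off can be sketched as follows. This is a minimal illustration, not the library's documented usage: it assumes that Calculate is a class method with an overload that takes a TBytes array and returns a TPJMD5Digest, and the program name and the two helper functions are hypothetical. Process needs an explicit instance and a try..finally block, while the assumed Calculate call needs neither.

```pascal
program ProcessVersusCalculate;

{$APPTYPE CONSOLE}

uses
  SysUtils, PJMD5;

// Hashes Data with an explicitly created TPJMD5 instance.
function HashWithProcess(const Data: TBytes): string;
var
  MD5: TPJMD5;
begin
  MD5 := TPJMD5.Create;
  try
    MD5.Process(Data);
    Result := MD5.Digest; // digest is implicitly cast to string
  finally
    MD5.Free;
  end;
end;

// ASSUMPTION: TPJMD5.Calculate is a class method with a TBytes overload
// that returns a TPJMD5Digest. Check the Calculate documentation.
function HashWithCalculate(const Data: TBytes): string;
begin
  Result := TPJMD5.Calculate(Data);
end;

var
  Data: TBytes;
begin
  Data := TEncoding.UTF8.GetBytes('The quick brown fox');
  WriteLn(HashWithProcess(Data));
  WriteLn(HashWithCalculate(Data));
end.
```

If the assumption about Calculate holds, both lines of output should show the same digest, since both routes hash the same bytes.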
Similar groups of methods are described below:
procedure Process(const X: TBytes; const StartIdx, Count: Cardinal);
overload;
procedure Process(const X: TBytes; const Count: Cardinal); overload;
procedure Process(const X: TBytes); overload;
These methods add bytes from a TBytes array to the current hash.
Suppose you have read a file into a byte array and want its MD5 hash. However, to save processing time, if the array is longer than 32Kb you take the hash of only the first and last 16Kb of data from the array. Here’s a function to do that:
function MD5OfArray(const A: TBytes): TPJMD5Digest;
var
  MD5: TPJMD5;
const
  ChunkSize = 16 * 1024;
  MaxSize = 2 * ChunkSize;
begin
  MD5 := TPJMD5.Create;
  try
    if Length(A) > MaxSize then
    begin
      MD5.Process(A, ChunkSize);                        // 1st 16Kb
      MD5.Process(A, Length(A) - ChunkSize, ChunkSize); // last 16Kb
    end
    else
      MD5.Process(A); // array <= 32Kb, process it all
    Result := MD5.Digest;
  finally
    MD5.Free;
  end;
end;
procedure Process(const Buf; const Count: Cardinal); overload;
This method adds Count bytes from untyped buffer Buf to the current hash. Buf must contain at least Count bytes.
Suppose you have two variables, Foo of type Byte and Bar of type Int64, and you need a single MD5 hash that covers both of them. Here’s the code to do it:
var
  Foo: Byte;
  Bar: Int64;
  MD5: TPJMD5;
begin
  Foo := 42;
  Bar := -56;
  MD5 := TPJMD5.Create;
  try
    MD5.Process(Foo, SizeOf(Foo));
    MD5.Process(Bar, SizeOf(Bar));
    MD5.Finalize; // optional
    ShowMessage(MD5.Digest); // implicitly casts Digest to string
  finally
    MD5.Free;
  end;
end;
procedure Process(const S: RawByteString); overload;
Adds the ordinal value of all the characters from an ANSI string S to the current hash. S can have any code page.
procedure Process(const S: ShortString); overload;
Adds the ordinal value of all the characters from the ShortString S to the current hash.
procedure Process(const S: WideString); overload;
Adds the ordinal value of all the WideChar characters from the WideString parameter S to the current hash.
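As a rough sketch of how the RawByteString and WideString overloads might be used, the program below hashes the same text once as UTF-8 encoded bytes and once as a WideString. The program name and the helper functions HashOfRawBytes and HashOfWide are hypothetical names chosen for this illustration. Typed parameters are used so that the compiler picks the intended overload.

```pascal
program StringOverloads;

{$APPTYPE CONSOLE}

uses
  SysUtils, PJMD5;

// Hashes the bytes of an ANSI string, whatever its code page.
function HashOfRawBytes(const S: RawByteString): string;
var
  MD5: TPJMD5;
begin
  MD5 := TPJMD5.Create;
  try
    MD5.Process(S); // resolves to the RawByteString overload
    Result := MD5.Digest; // digest is implicitly cast to string
  finally
    MD5.Free;
  end;
end;

// Hashes the WideChar characters of a WideString.
function HashOfWide(const S: WideString): string;
var
  MD5: TPJMD5;
begin
  MD5 := TPJMD5.Create;
  try
    MD5.Process(S); // resolves to the WideString overload
    Result := MD5.Digest;
  finally
    MD5.Free;
  end;
end;

begin
  WriteLn(HashOfRawBytes(UTF8Encode('hello')));
  WriteLn(HashOfWide('hello'));
end.
```

The two digests would normally be expected to differ even though the text is the same, because the single-byte and WideChar representations of the string contribute different data to the hash.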
procedure Process(const S: UnicodeString; const Encoding: TEncoding); overload;
procedure Process(const S: UnicodeString); overload;
Each of these methods adds data from a Unicode string S to the current hash. Before adding to the hash the string is converted to a sequence of bytes. The first version uses the encoding passed in the Encoding parameter to perform the conversion, while the second version uses the TEncoding.Default encoding.
Suppose you have two text files that have the same text but may have different amounts of white space or different kinds of line endings. You want the MD5 hash to depend only on the words and not the white space.
One solution is to read all the words from a file into a string list, ignoring intervening white space, and then build the MD5 hash from the words. Assuming the words are in a string list, you can get the MD5 hash using the following function:
function MD5OfStrings(const Words: TStrings): TPJMD5Digest;
var
  MD5: TPJMD5;
  Word: string;
begin
  MD5 := TPJMD5.Create;
  try
    for Word in Words do
      MD5.Process(Word);
    Result := MD5.Digest;
  finally
    MD5.Free;
  end;
end;
This code uses the system default encoding of the words in the string, which could mean that different hashes are returned on systems running on different locales. To get round this, use UTF8 (or Unicode) for the encoding. Here’s an example using UTF8:
function MD5OfStrings(const Words: TStrings): TPJMD5Digest;
var
  MD5: TPJMD5;
  Word: string;
begin
  MD5 := TPJMD5.Create;
  try
    for Word in Words do
      MD5.Process(Word, TEncoding.UTF8);
    Result := MD5.Digest;
  finally
    MD5.Free;
  end;
end;
procedure Process(const Stream: TStream; const Count: Int64); overload;
procedure Process(const Stream: TStream); overload;
Each of these methods adds bytes from the stream Stream to the current hash. The stream is read from its current position. To read from the start of the stream, set its Position property to 0. Both methods modify the stream’s Position property.
The first version reads Count bytes from the stream if possible. If Count is greater than the number of bytes available then an EPJMD5 exception is raised. The second version reads to the end of the stream, processing Stream.Size - Stream.Position bytes.
The stream is read into an internal buffer before adding the data to the hash. The buffer’s size is given by the ReadBufferSize property and can be changed by assigning a new value to the property.
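To illustrate, here is a minimal sketch that sets ReadBufferSize before hashing a file through a TFileStream. The program name, the HashFile helper and the 256Kb buffer size are hypothetical choices for this example, and the Cardinal type used for the buffer size parameter is an assumption about the property's type. A different buffer size may affect how the stream is read, but it should not change the resulting digest.

```pascal
program StreamBufferSize;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, PJMD5;

// Hashes a whole file, reading it through a buffer of BufferSize bytes.
// ASSUMPTION: ReadBufferSize accepts a Cardinal value.
function HashFile(const FileName: string; const BufferSize: Cardinal): string;
var
  MD5: TPJMD5;
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    MD5 := TPJMD5.Create;
    try
      MD5.ReadBufferSize := BufferSize; // set before the stream is processed
      MD5.Process(Stream); // reads from current position to end of stream
      Result := MD5.Digest; // digest is implicitly cast to string
    finally
      MD5.Free;
    end;
  finally
    Stream.Free;
  end;
end;

begin
  // Pass the file to hash as the first command line parameter
  if ParamCount >= 1 then
    WriteLn(HashFile(ParamStr(1), 256 * 1024));
end.
```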
Suppose you have a file containing multiple streams or “storages” and you have opened a TStream onto each storage in the file. You want to get an MD5 hash of all of them. However, some can be very large, so to save processing time you take the hash of only the first 32Kb of each stream.
The following function will do the job: it is passed an array of TStream objects and returns the MD5 digest:
function GetStreamHashes(const Streams: array of TStream): TPJMD5Digest;
var
  MD5: TPJMD5;
  Stream: TStream;
const
  MaxSize = Int64(32 * 1024);
begin
  MD5 := TPJMD5.Create;
  try
    for Stream in Streams do
    begin
      Stream.Position := 0;
      if Stream.Size > MaxSize then
        MD5.Process(Stream, MaxSize) // process first 32Kb of stream
      else
        MD5.Process(Stream); // stream <= 32Kb - process it all
    end;
    Result := MD5.Digest;
  finally
    MD5.Free;
  end;
end;