MD5 How-to: How To Get the MD5 Hash of an Array

Applies to: ~>1.0

This how-to assumes you know how and when to use TPJMD5.Calculate and TPJMD5.Process. For details see here.

Several different techniques are needed to get an MD5 hash of an array. Which technique to use depends on both the kind of array (static or dynamic) and on the type of the array’s elements.

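This how-to gives solutions for TBytes arrays, arrays of simple types (both static and dynamic), arrays of reference types, and arrays of records and short strings.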

TBytes arrays

The simplest case is that of TBytes arrays because TPJMD5 can natively handle getting the hash of a TBytes array.

Here’s an example that uses TPJMD5.Calculate, because it leads to more concise code than using TPJMD5.Process would.

var
  A: TBytes;
  D: TPJMD5Digest;
begin
  A := TBytes.Create(1,2,3,4,5,6,7,8);
  D := TPJMD5.Calculate(A);
  ShowMessage(D); // D is implicitly cast to a string
end;

Learn about casting TPJMD5Digest to a string here.

TBytes is the only array type for which TPJMD5 provides direct support. For all other array types you have to do some more work.
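For comparison, here is a sketch of the same job done with TPJMD5.Process. It is written as a console program, so it uses WriteLn rather than ShowMessage. It assumes the library's unit is named PJMD5 and is on the search path, and the helper name HashBytesWithProcess is illustrative, not part of the library.

```pascal
program TBytesProcessDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, PJMD5;  // assumption: the MD5 unit is available as PJMD5

// Illustrative helper (not part of the library): hashes a TBytes array
// by feeding it to a TPJMD5 instance instead of calling Calculate.
function HashBytesWithProcess(const A: TBytes): string;
var
  MD5: TPJMD5;
begin
  MD5 := TPJMD5.Create;
  try
    MD5.Process(A);        // add the whole array to the hash
    Result := MD5.Digest;  // digest is implicitly cast to a string
  finally
    MD5.Free;
  end;
end;

var
  A: TBytes;
  S: string;
begin
  A := TBytes.Create(1,2,3,4,5,6,7,8);
  S := TPJMD5.Calculate(A);          // one-call version
  WriteLn(S);
  WriteLn(HashBytesWithProcess(A));  // create/process/free version
end.
```

Both lines of output should show the same digest. The extra create/try/finally scaffolding is why Calculate is the more concise choice when only one block of data is hashed.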

Arrays of simple types

Arrays of simple types (i.e. ordinal and real types) are quite simple to handle.

Each element contains an actual value (not a reference or pointer) and the elements are always contiguous in memory, regardless of use of the packed keyword and the $ALIGN compiler directive [ref]. Therefore we can just get the MD5 of the block of memory occupied by the array, using the untyped overloads of TPJMD5.Calculate and TPJMD5.Process.

Exactly how we proceed depends on whether the array is static or dynamic.
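The contiguity claim can be checked with a small console program that does not need the MD5 unit at all. This is only a sketch: the type names TPlain and TPacked are made up for the test.

```pascal
program ContiguityCheck;

{$APPTYPE CONSOLE}

type
  // Hypothetical types used only for this check
  TPlain  = array[1..5] of Word;
  TPacked = packed array[1..5] of Word;

var
  P: TPlain;
  I: Integer;
begin
  // Total size equals element count times element size
  WriteLn(SizeOf(TPlain) = Length(P) * SizeOf(Word));
  // The packed keyword makes no difference to the size
  WriteLn(SizeOf(TPacked) = SizeOf(TPlain));
  // Each element starts exactly SizeOf(Word) bytes after its predecessor
  for I := Low(P) to High(P) - 1 do
    WriteLn(NativeUInt(@P[I + 1]) - NativeUInt(@P[I]) = SizeOf(Word));
end.
```

If the elements are contiguous, every comparison evaluates to true, which is what makes hashing the array as a single block of memory safe.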

Static arrays

If the array is static (i.e. its size is fixed at compile time), code like the following should be used. This example uses the untyped overload of TPJMD5.Calculate.

const
  A: array[1..5] of Extended = (0.42, 4.2, 42.0, 420.0, 4200.0);
var
  D: TPJMD5Digest;
begin
  D := TPJMD5.Calculate(A[Low(A)], SizeOf(A));
  ShowMessage(D); // D is implicitly cast to a string
end;

The block of memory containing the array elements starts at the location of the first array element, A[Low(A)]. This is passed as the first parameter to TPJMD5.Calculate. The method’s second parameter requires the size of the memory to be processed: we get this from the SizeOf function, which returns the total size of the array in bytes.

This example uses the Extended real type, but the array element can be any simple type.
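The untyped overload of TPJMD5.Process takes the same two arguments. The following console sketch hashes the same static array through an instance of TPJMD5. It assumes the unit name PJMD5 and writes the digest with WriteLn.

```pascal
program StaticArrayProcessDemo;

{$APPTYPE CONSOLE}

uses
  PJMD5;  // assumption: the MD5 unit is available as PJMD5

const
  A: array[1..5] of Extended = (0.42, 4.2, 42.0, 420.0, 4200.0);

var
  MD5: TPJMD5;
  S: string;
begin
  MD5 := TPJMD5.Create;
  try
    // Untyped overload: the first element marks the start of the block
    // and SizeOf(A) gives the block's length in bytes
    MD5.Process(A[Low(A)], SizeOf(A));
    S := MD5.Digest;  // digest is implicitly cast to a string
    WriteLn(S);
  finally
    MD5.Free;
  end;
end.
```

This form is mainly useful when the array is only part of the data being hashed, since further calls to Process can add more data to the same hash.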

Dynamic arrays

If the array is dynamic then a slightly different approach is required: SizeOf can’t be used because, applied to a dynamic array variable, it returns the size of the pointer that refers to the array rather than the size of the array’s data. Here’s the code for dynamic arrays, this time using the ordinal type Word:

type
  TWordArray = array of Word;
var
  D: TPJMD5Digest;
  A: TWordArray;
begin
  A := TWordArray.Create(5,4,3,2,1);
  D := TPJMD5.Calculate(Pointer(A)^, Length(A) * SizeOf(Word));
  ShowMessage(D); // D is implicitly cast to a string
end;

Since dynamic array variables are essentially pointers, we get the first parameter of TPJMD5.Calculate by casting A to a pointer and then dereferencing it. The size of the data, required as the second parameter, is found by multiplying the length of the array by the size of one element.
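The difference between the size of the variable and the size of the data can be seen in a short console program that needs no MD5 code. This is a sketch for illustration only.

```pascal
program DynArraySizeDemo;

{$APPTYPE CONSOLE}

type
  TWordArray = array of Word;

var
  A: TWordArray;
begin
  SetLength(A, 5);
  // SizeOf sees only the pointer-sized variable, not the data it refers to
  WriteLn(SizeOf(A) = SizeOf(Pointer));
  // The byte count of the data has to be computed at run time:
  // 5 elements of 2 bytes each
  WriteLn(Length(A) * SizeOf(Word));
  // Casting the variable to a pointer gives the address of the first element
  WriteLn(Pointer(A) = @A[0]);
end.
```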

This technique also works for the TArray<T> generic array type. To check this delete the type definition in the above code then replace all occurrences of TWordArray with TArray<Word>.
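The substitution can also be checked side by side. The console sketch below hashes the same values held in a TWordArray and in a TArray<Word> and compares the digests as strings. It assumes the unit name PJMD5 and a compiler version that supports TArray<T>.

```pascal
program GenericArrayDemo;

{$APPTYPE CONSOLE}

uses
  PJMD5;  // assumption: the MD5 unit is available as PJMD5

type
  TWordArray = array of Word;

var
  A1: TWordArray;
  A2: TArray<Word>;
  S1, S2: string;
begin
  A1 := TWordArray.Create(5,4,3,2,1);
  A2 := TArray<Word>.Create(5,4,3,2,1);
  // Digests are implicitly cast to strings on assignment
  S1 := TPJMD5.Calculate(Pointer(A1)^, Length(A1) * SizeOf(Word));
  S2 := TPJMD5.Calculate(Pointer(A2)^, Length(A2) * SizeOf(Word));
  WriteLn(S1);
  WriteLn(S2);
  // Both arrays hold the same bytes, so the digests should match
  WriteLn(S1 = S2);
end.
```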

Arrays of reference types

If you have an array of reference types, such as strings, objects, pointers or dynamic arrays, you need to iterate over all the elements in the array and add each one to the MD5 hash in turn, using whatever mechanism you would normally use to get the hash of the element’s type. You can’t use TPJMD5.Calculate for this because you must be able to add more than one element to the same hash, so TPJMD5.Process must be used.

Here is some boilerplate code:

var
  MD5: TPJMD5;
  A: array of TElemType; // could also be TArray<TElemType> or static array
  Elem: TElemType;
begin
  // initialise array here
  MD5 := TPJMD5.Create;
  try
    for Elem in A do
    begin
      // add Elem to MD5 hash here, ultimately calling Process()
    end;
    ShowMessage(MD5.Digest);
  finally
    MD5.Free;
  end;
end;

Here is a concrete example using a dynamic array of AnsiString values:

var
  MD5: TPJMD5;
  A: TArray<AnsiString>;
  Elem: AnsiString;
begin
  A := TArray<AnsiString>.Create('one', 'two', 'three', 'four', 'five');
  MD5 := TPJMD5.Create;
  try
    for Elem in A do
      MD5.Process(Elem);
    ShowMessage(MD5.Digest);
  finally
    MD5.Free;
  end;
end;

And here is a static array example using Unicode strings encoded in UTF-8:

const
  A: array[1..5] of UnicodeString = (
    'one', 'two', 'three', 'four', 'five'
  );
var
  MD5: TPJMD5;
  Elem: UnicodeString;
begin
  MD5 := TPJMD5.Create;
  try
    for Elem in A do
      MD5.Process(Elem, TEncoding.UTF8);
    ShowMessage(MD5.Digest);
  finally
    MD5.Free;
  end;
end;

Note that an array of strings with one or more elements containing the empty string will have the same hash as the same array without any empty elements, because an empty string adds no data to the hash. Therefore you may wish to concatenate the strings using some known delimiter and get the hash of the resulting string. This method will produce a different hash for arrays with empty elements. See How To Get the MD5 Hash of a String List for a more detailed discussion of this problem.
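A minimal sketch of the delimiter approach follows. The helper names JoinWithDelim and HashStrings and the choice of #31 as delimiter are assumptions made for illustration. Any delimiter will do as long as it cannot occur in the data. The sketch places the delimiter after every element, which is one possible variant of the idea.

```pascal
program DelimitedStringsDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, PJMD5;  // assumption: the MD5 unit is available as PJMD5

const
  // Hypothetical delimiter: must be a character that never occurs in the data
  Delim = #31;

// Illustrative helper: concatenates the elements, adding Delim after each one
function JoinWithDelim(const A: array of UnicodeString): UnicodeString;
var
  I: Integer;
begin
  Result := '';
  for I := Low(A) to High(A) do
    Result := Result + A[I] + Delim;
end;

// Illustrative helper: hashes the joined string, encoded as UTF-8
function HashStrings(const A: array of UnicodeString): string;
var
  MD5: TPJMD5;
begin
  MD5 := TPJMD5.Create;
  try
    MD5.Process(JoinWithDelim(A), TEncoding.UTF8);
    Result := MD5.Digest;  // digest is implicitly cast to a string
  finally
    MD5.Free;
  end;
end;

begin
  // The joined strings differ, so the empty element now affects the hash
  WriteLn(HashStrings(['a', '', 'b']));
  WriteLn(HashStrings(['a', 'b']));
end.
```

The two lines of output should differ, whereas processing the elements one at a time without a delimiter would give the same digest for both arrays.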

If you have an array of objects then use the boilerplate code above and see the How to Get the MD5 Hash of an Object how-to for details of how to hash each object.

Arrays of other types

Records

If you have an array of records, use the same boilerplate code presented in Arrays of reference types above and get the MD5 hash of each record element using the techniques presented in the How to Get the MD5 Hash of a Record how-to.
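As a rough illustration of how the boilerplate might be filled in, the sketch below hashes an array of a made-up record type field by field. The TItem type is hypothetical, the unit name PJMD5 is assumed, and the per-field approach is only one possibility. The record how-to is the place to look for the recommended techniques.

```pascal
program RecordArrayDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, PJMD5;  // assumption: the MD5 unit is available as PJMD5

type
  // Hypothetical record type, used only for illustration
  TItem = record
    ID: Integer;          // simple type: hashed as raw bytes
    Name: UnicodeString;  // reference type: hashed via the string overload
  end;

var
  A: array[1..2] of TItem;
  Elem: TItem;
  MD5: TPJMD5;
  S: string;
begin
  A[1].ID := 1;  A[1].Name := 'one';
  A[2].ID := 2;  A[2].Name := 'two';
  MD5 := TPJMD5.Create;
  try
    for Elem in A do
    begin
      // Add each field in turn rather than the record's raw memory,
      // because Name holds a pointer, not the string's characters
      MD5.Process(Elem.ID, SizeOf(Elem.ID));
      MD5.Process(Elem.Name, TEncoding.UTF8);
    end;
    S := MD5.Digest;  // digest is implicitly cast to a string
    WriteLn(S);
  finally
    MD5.Free;
  end;
end.
```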

Short strings

For an array of ShortString values we again iterate over the array and use the ShortString overload of TPJMD5.Process to get the hash of each string:

const
  A: array[1..5] of ShortString = (
    'one', 'two', 'three', 'four', 'five'
  );
var
  MD5: TPJMD5;
  Elem: ShortString;
begin
  MD5 := TPJMD5.Create;
  try
    for Elem in A do
      MD5.Process(Elem);
    ShowMessage(MD5.Digest);
  finally
    MD5.Free;
  end;
end;

See Also