Applies to: ~>1.0
This how-to involves getting the MD5 hash of Unicode strings, which is explained separately.
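There are two possible ways to create a hash of a TStringList object:
1. Get the hash of the string list's Text property.
2. Add each string in the list, in turn, to the same hash.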
Each of these approaches will give a different hash, so you need to decide on one approach and stick to it if you want to have repeatable results.
To see the differences, start a new Delphi VCL forms application and drop two edit controls on the form. Create the following FormCreate event handler:
// Requires the PJMD5 unit in the form unit's uses clause.
procedure TForm1.FormCreate(Sender: TObject);
var
  D: TPJMD5Digest;
  MD5: TPJMD5;
  Strings: TStrings;
  S: string;
begin
  Strings := TStringList.Create;
  try
    Strings.Add('The');
    Strings.Add('cat');
    Strings.Add('sat');
    Strings.Add('on');
    Strings.Add('the');
    Strings.Add('mat');
    // 1st approach: hash the whole list as a single string
    Strings.LineBreak := #13#10;
    D := TPJMD5.Calculate(Strings.Text, TEncoding.UTF8);
    Edit1.Text := D;
    // 2nd approach: add each string in turn to the same hash
    MD5 := TPJMD5.Create;
    try
      for S in Strings do
        MD5.Process(S, TEncoding.UTF8);
      Edit2.Text := MD5.Digest;
    finally
      MD5.Free;
    end;
  finally
    // Free the list once both hashes have been displayed
    Strings.Free;
  end;
end;
Running this program displays the following values in the edit controls:
c8b029b7698b23a5962e7cc21a75653a (MD5 of Strings.Text)
780c94281a0b1e10395098c690a91d26 (MD5 of each string in Strings)
The first approach converts the string list to text, with each line separated by the string stored in the TStrings.LineBreak property. It then uses one of the Unicode overloads of TPJMD5.Calculate to get the required digest. The resulting hash includes the line break characters.
The second approach adds each string from the string list in turn to the same hash. It uses one of the TPJMD5.Process Unicode overloaded methods to do this.
There are advantages and disadvantages of each approach:
The second approach gives the same MD5 hash if you insert empty lines into the string list. This is because an empty string added to a hash makes no difference to it. (Try adding one or more Strings.Add(''); statements to the above code to check this.) The first approach gives a different hash because of the extra line breaks included in the string (provided that the TStrings.LineBreak property is not the empty string).
With the first approach, changing the TStrings.LineBreak property will change the hash for the same string list. Therefore you must be careful to ensure that the line break is always the same. (Try changing the line Strings.LineBreak := #13#10; to Strings.LineBreak := #10; in the above code to confirm this.)
The first approach introduces additional data into the mix (the line breaks), meaning that the hash doesn’t relate only to the list contents.
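The points above can be checked with a small console program. This is only a sketch: it assumes Delphi XE2 or later (for the scoped unit names) and that the PJMD5 unit is on the search path. HashOfText and HashOfItems are hypothetical helper names introduced here for illustration; they are not part of the library. If the reasoning above holds, only the two checks on the second approach should report TRUE.

```delphi
program EmptyLinesAndLineBreaks;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes,
  PJMD5; // assumed to be on the unit search path

// 1st approach: hash the Text property in one go (hypothetical helper).
function HashOfText(Strings: TStrings): string;
begin
  // The digest converts implicitly to its string form
  Result := TPJMD5.Calculate(Strings.Text, TEncoding.UTF8);
end;

// 2nd approach: add each string in turn to the same hash
// (hypothetical helper).
function HashOfItems(Strings: TStrings): string;
var
  MD5: TPJMD5;
  S: string;
begin
  MD5 := TPJMD5.Create;
  try
    for S in Strings do
      MD5.Process(S, TEncoding.UTF8);
    Result := MD5.Digest;
  finally
    MD5.Free;
  end;
end;

var
  Strings: TStrings;
  TextBefore, ItemsBefore: string;
begin
  Strings := TStringList.Create;
  try
    Strings.LineBreak := #13#10;
    Strings.Add('The');
    Strings.Add('cat');
    Strings.Add('sat');

    // Record both hashes for the unmodified list
    TextBefore := HashOfText(Strings);
    ItemsBefore := HashOfItems(Strings);

    // Insert an empty line and compare against the recorded hashes
    Strings.Insert(1, '');
    Writeln('2nd approach unchanged by empty line: ',
      HashOfItems(Strings) = ItemsBefore);
    Writeln('1st approach unchanged by empty line: ',
      HashOfText(Strings) = TextBefore);

    // Restore the original lines, then change only the line break
    Strings.Delete(1);
    Strings.LineBreak := #10;
    Writeln('2nd approach unchanged by new LineBreak: ',
      HashOfItems(Strings) = ItemsBefore);
    Writeln('1st approach unchanged by new LineBreak: ',
      HashOfText(Strings) = TextBefore);
  finally
    Strings.Free;
  end;
end.
```

Comparing the digests in their string form keeps the sketch to the conversions already used in the FormCreate example.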
You must decide which of the approaches to use. If empty lines are not significant, I would opt for the second approach as being more “pure”. However, if empty lines are significant, I would use the first approach.
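One way to make that decision explicit in code is a single helper that takes the choice as a parameter. The following is a minimal sketch, not part of the library: the function name StringListMD5, its EmptyLinesSignificant parameter and the FixedLineBreak constant are all assumptions made for illustration, as is the use of scoped unit names (Delphi XE2 or later). When the first approach is used, the helper pins the line break to a fixed value for the duration of the call so that the hash does not depend on whatever LineBreak the caller happened to set.

```delphi
program StringListHashHelper;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes,
  PJMD5; // assumed to be on the unit search path

const
  // Hypothetical convention: lines are always joined with CRLF
  FixedLineBreak = #13#10;

// Hypothetical helper: returns the MD5 hash of a string list as a string,
// using the 1st approach when empty lines matter and the 2nd otherwise.
function StringListMD5(Strings: TStrings;
  EmptyLinesSignificant: Boolean): string;
var
  MD5: TPJMD5;
  S, SavedLineBreak: string;
begin
  if EmptyLinesSignificant then
  begin
    // 1st approach: hash Text, with the line break pinned temporarily
    SavedLineBreak := Strings.LineBreak;
    Strings.LineBreak := FixedLineBreak;
    try
      Result := TPJMD5.Calculate(Strings.Text, TEncoding.UTF8);
    finally
      // Put the caller's line break back
      Strings.LineBreak := SavedLineBreak;
    end;
  end
  else
  begin
    // 2nd approach: add each string in turn to the same hash
    MD5 := TPJMD5.Create;
    try
      for S in Strings do
        MD5.Process(S, TEncoding.UTF8);
      Result := MD5.Digest;
    finally
      MD5.Free;
    end;
  end;
end;

var
  Strings: TStrings;
begin
  Strings := TStringList.Create;
  try
    Strings.Add('The');
    Strings.Add('');
    Strings.Add('cat');
    Writeln('Empty lines significant:     ', StringListMD5(Strings, True));
    Writeln('Empty lines not significant: ', StringListMD5(Strings, False));
  finally
    Strings.Free;
  end;
end.
```

Saving and restoring LineBreak avoids a lasting change to the caller's list, although the property is still modified briefly, which may matter if the list is shared between threads.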