The hex editor used in these examples is HxD.
ANSI as UTF-8
This means that the file is UTF-8 encoded without a Byte Order Mark (BOM).
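As a rough illustration of the difference, the sketch below checks whether a byte string starts with the UTF-8 BOM. The helper name `has_utf8_bom` and the sample text are made up for this example; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the Python standard library.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if data begins with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

text = "UTF-8"  # hypothetical sample text, not taken from the screenshots

plain = text.encode("utf-8")       # UTF-8 without a BOM
marked = text.encode("utf-8-sig")  # same text with the BOM prepended

print(plain.hex(" "))        # 55 54 46 2d 38
print(has_utf8_bom(plain))   # False
print(has_utf8_bom(marked))  # True
```

Apart from the three leading BOM bytes, the two byte strings are identical.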
UTF-8
In this example I’ve added an extended character to show how ‘normal’ (ASCII) characters are stored as a single byte while others take multiple bytes.
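Here you can see the BOM 0xEF 0xBB 0xBF, then the bulk of the text stored as one byte per character, and finally the bytes 0xC2 0x81 0x42. Strictly speaking, 0xC2 0x81 is a two-byte UTF-8 sequence (U+0081) and the trailing 0x42 is a single-byte B, so the extended character itself takes two bytes rather than three.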
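To reproduce a byte layout like this outside the hex editor, here is a minimal Python sketch. The sample string is an assumption built from the bytes quoted above (a `U`, then U+0081, then `B`), and `utf8_breakdown` is a hypothetical helper. Code points are printed as `U+XXXX` rather than as characters, since U+0081 is a control character that many consoles cannot display.

```python
def utf8_breakdown(text: str) -> list:
    """Pair each code point (as U+XXXX) with its UTF-8 bytes in hex."""
    return [(f"U+{ord(ch):04X}", ch.encode("utf-8").hex(" ")) for ch in text]

sample = "U\u0081B"  # hypothetical sample: 'U', U+0081, 'B'

# "utf-8-sig" writes the BOM first, then the UTF-8 bytes.
print(sample.encode("utf-8-sig").hex(" "))  # ef bb bf 55 c2 81 42

for code_point, hex_bytes in utf8_breakdown(sample):
    print(code_point, "->", hex_bytes)
# U+0055 -> 55
# U+0081 -> c2 81
# U+0042 -> 42
```

The ASCII characters come out as one byte each, while U+0081 needs a lead byte (0xC2) plus a continuation byte (0x81).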
UCS2 Little Endian
In this example you can see the following:
- The endianness of each word, indicated by the BOM 0xFF 0xFE.
- Characters being stored as two bytes (a 16-bit word), e.g. U is stored as 0x55 0x00.
- The final two words (4 bytes) 0x81 0x00 0x42 0x00: the extended character U+0081 followed by B (U+0042), one word each.
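The same layout can be sketched in Python. Python has no codec named UCS-2, so this example assumes `utf-16-le` as a stand-in, which produces the same bytes for characters in the Basic Multilingual Plane. The sample string and the helper `words_le` are made up for illustration.

```python
import codecs

def words_le(data: bytes) -> list:
    """Split data into 16-bit words, reading each byte pair as little-endian."""
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

sample = "U\u0081B"  # hypothetical sample: 'U', U+0081, 'B'

# "utf-16-le" writes no BOM itself, so prepend it explicitly.
encoded = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")

print(encoded.hex(" "))  # ff fe 55 00 81 00 42 00
print([f"0x{w:04X}" for w in words_le(encoded)])
# ['0xFEFF', '0x0055', '0x0081', '0x0042']
```

Read as a little-endian word, the bytes 0xFF 0xFE give 0xFEFF, the BOM code point. A reader that assumed big-endian would see 0xFFFE instead, which is how the BOM signals the byte order.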
If we delete the BOM, Notepad++ displays the encoding as UCS2 Little Endian without BOM.
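Deleting the BOM only removes the first two bytes; the rest of the file is unchanged, but a reader now has to be told (or has to guess) the byte order. A minimal sketch of that, again assuming `utf-16-le` as a stand-in for UCS-2 Little Endian, with a hypothetical helper `strip_utf16le_bom` and made-up sample text:

```python
import codecs

def strip_utf16le_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-16 LE BOM (FF FE), if one is present."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):]
    return data

sample = "U\u0081B"  # hypothetical sample: 'U', U+0081, 'B'
with_bom = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")
without_bom = strip_utf16le_bom(with_bom)

print(without_bom.hex(" "))  # 55 00 81 00 42 00

# With no BOM in the data, the byte order must be named explicitly when decoding.
print(without_bom.decode("utf-16-le") == sample)  # True
```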