I have an ArrayBuffer that is returned by reading memory using Frida. I'm converting the ArrayBuffer to a string, then back to an ArrayBuffer, using TextDecoder and TextEncoder, but the result is altered in the process. The ArrayBuffer length after decoding and re-encoding always comes out larger. Is some character being decoded in a way that expands it?
How can I decode an ArrayBuffer to a String, then back to an ArrayBuffer without losing integrity?
    var arrayBuff = Memory.readByteArray(pointer, 2000); // Get a 2,000-byte ArrayBuffer
    console.log(arrayBuff.byteLength); // Always returns 2,000

    var textDecoder = new TextDecoder("utf-8");
    var textEncoder = new TextEncoder("utf-8");

    // Decode and encode the same data without making any changes
    var decoded = textDecoder.decode(arrayBuff);
    var encoded = textEncoder.encode(decoded);
    console.log(encoded.byteLength); // Fluctuates, but is always greater than 2,000
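TextDecoder and TextEncoder are designed to work with text, not arbitrary binary data. Byte sequences that are not valid UTF-8 are decoded to the replacement character U+FFFD, which takes three bytes when encoded back to UTF-8, so the re-encoded buffer grows and the original bytes are lost.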
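To see why the length grows, here is a minimal sketch that runs outside Frida, assuming an environment where TextDecoder and TextEncoder are available as globals. The three sample bytes are made up for illustration: two of them can never appear in valid UTF-8, so each is decoded to U+FFFD and comes back as three bytes.

```javascript
// Illustrative input: 0xFF and 0xFE are never valid in UTF-8; 0x41 is ASCII "A".
const originalBytes = new Uint8Array([0xff, 0xfe, 0x41]);

// Each invalid byte is replaced with U+FFFD (the replacement character).
const decodedText = new TextDecoder("utf-8").decode(originalBytes);

// U+FFFD is encoded as the three bytes EF BF BD, so the output is longer.
const reencodedBytes = new TextEncoder().encode(decodedText);

console.log(originalBytes.byteLength);  // 3
console.log(reencodedBytes.byteLength); // 7 (3 + 3 + 1)
```

Memory read from a process will usually contain many such invalid sequences, which would explain why the re-encoded length varies from one read to the next.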
To convert an arbitrary byte sequence into a string and back, it's best to treat each byte as a single character.
    var arrayBuff = Memory.readByteArray(pointer, 2000); // Get a 2,000-byte ArrayBuffer
    console.log(arrayBuff.byteLength); // Always returns 2,000

    // Decode and encode the same data without making any changes
    var decoded = String.fromCharCode(...new Uint8Array(arrayBuff));
    var encoded = Uint8Array.from([...decoded].map(ch => ch.charCodeAt())).buffer;
    console.log(encoded.byteLength);
The decoded string will have exactly the same length as the input buffer and can be easily manipulated with regular expressions, string methods, etc. But beware that Unicode characters that occupy two or more bytes in memory (for example, non-ASCII characters stored as UTF-8) won't be recognizable anymore, because each of their bytes is turned into a separate character and the results are concatenated.
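The byte-per-character approach and its caveat can be sketched as a pair of helpers. The names bytesToBinaryString and binaryStringToBytes are hypothetical (they are not part of Frida or any library), and the sample bytes are made up. The conversion is done in chunks because spreading a very large array into String.fromCharCode can exceed the engine's argument limit; the chunk size used here is an arbitrary assumption.

```javascript
// Hypothetical helper: one character per byte, converted in chunks.
function bytesToBinaryString(buffer, chunkSize = 0x8000) {
  const bytes = new Uint8Array(buffer);
  let result = "";
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // subarray creates a view on the same memory; no bytes are copied.
    result += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return result;
}

// Hypothetical helper: the inverse, one byte per character.
function binaryStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i); // codes are 0..255 if str came from the helper above
  }
  return bytes.buffer;
}

// Illustrative input: "café" stored as UTF-8, where "é" is the two bytes C3 A9.
const sampleBuffer = new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9]).buffer;

const binaryString = bytesToBinaryString(sampleBuffer);
console.log(binaryString.length);                  // 5, one character per byte
console.log(binaryString === "caf\u00c3\u00a9");   // true: "é" appears as "Ã©"

const restored = new Uint8Array(binaryStringToBytes(binaryString));
console.log(restored.byteLength);                  // 5, same as the input
```

The round trip preserves every byte, but searching the string for "é" would find nothing, so patterns for multi-byte text have to be written in terms of the individual byte values.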