Optimize Piece-s memory usage#35
Conversation
…entHash instead of separate byte buffer in every Piece object
…ecessary data duplication can be avoided.
|
|
||
| public static byte[] hash(ByteBuffer data) throws NoSuchAlgorithmException { | ||
| MessageDigest md = MessageDigest.getInstance("SHA-1"); | ||
| md.update(data); |
There was a problem hiding this comment.
This will create the byte array internally
There was a problem hiding this comment.
Only if it's a ByteBuffer which doesn't backed by an accessible byte array, and in that case, only create a 4K buffer, which is more memory friendly solution, than to create a byte buffer at least 64K, to accommodate a torrent block just for hashing. I could run the torrent client more efficiently this way, YMMV.
There was a problem hiding this comment.
The buffer being passed is the data read from the "output" file (this._read(...)), which is a direct buffer, i.e. is not array backed, no ? The piecesHashes one is array backed.
I couldn't understand the rest of your answer, sorry. Upon reading the source of the MessageDigest I can see this optimization is there, reusing the byte array internally tempArray, but here md is created and discarded just as well, I think the solution lies more in the lines of #184, of having a "resource store" where you pick it up, use it and relinquish ownership...
There was a problem hiding this comment.
If you look into here : https://github.com/mpetazzoni/ttorrent/pull/35/files#diff-454b627720e29935f25dab9a660670b1L166 you can see a big memory allocation. I don't know how big is it, if it's 64K or 512K - I've implemented it 3.5 years ago - but I'm fairly certain that one torrent piece can be pretty big, and it is much simpler to pass the ByteBuffer directly to the MessageDigest class, instead of allocating a new byte array, copy everything into it, calculate the hash.
Sorry being not too clear, English is not my native language :(
There was a problem hiding this comment.
Hey, I think I can sum up the bytes duplication:
- each byte piece is duplicated from the piecesHashes -> this one you got rid of by hitting the piecesHashes using the offset;
- each file piece is read into a ByteBuffer then copied into a byte array -> this one you got rid of copying, by passing the ByteBuffer (which has a byte array internally) to the MessageDigest mehod;
I think that what remains is limiting the number of existing ByteBuffer buffer from file reading, like I tried in #195 , I'll try to limit it then profile it, with this done I think this patch is complete.
|
|
||
| int torrentHashPosition = getIndex() * Torrent.PIECE_HASH_SIZE; | ||
| for (int i = 0; i < Torrent.PIECE_HASH_SIZE; i++) { | ||
| byte value = this.torrentHash.get(torrentHashPosition + i); |
There was a problem hiding this comment.
Fetching byte per byte on a direct buffer is faster than just fetching the whole thing once ?
Remove a couple of unnecessary fields from the Piece class :