
March 01, 2021

FileReader API on big files

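This article is a repost, collected solely for my own knowledge management. If it infringes any rights, please contact suziwen1@gmail.com and it will be removed immediately.
Collecting this article does not mean that I endorse its views; I simply find the content thought-provoking and worth discussing, and it has value of its own.

Reposted from: https://stackoverflow.com/questions/25810051/filereader-api-on-big-files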


Your application is failing for big files because you're reading the full file into memory before processing it. This inefficiency can be solved by streaming the file (reading chunks of a small size), so you only need to hold a part of the file in memory.

A File object is also an instance of Blob, which offers the .slice method to create a smaller view of the file.
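As a rough sketch of how .slice can carve a file into fixed-size views: the helper name `chunkRanges` is hypothetical, and a small Blob stands in for a real File here (File inherits .slice from Blob). Slicing does not read any data; it only creates a view that can later be passed to a FileReader.

```javascript
// Hypothetical helper: list the [start, end) byte ranges that cover
// a Blob or File of the given size, chunkSize bytes at a time.
function chunkRanges(totalSize, chunkSize) {
  var ranges = [];
  for (var start = 0; start < totalSize; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, totalSize)]);
  }
  return ranges;
}

// A 10-byte Blob stands in for a File chosen by the user (assumption).
var blob = new Blob(['0123456789']);

// Each slice is a lightweight view; nothing is read into memory yet.
var parts = chunkRanges(blob.size, 4).map(function (range) {
  return blob.slice(range[0], range[1]);
});

// Sizes of the resulting views, in bytes.
var partSizes = parts.map(function (part) { return part.size; });
```
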

Here is an example that assumes that the input is ASCII (demo: http://jsfiddle.net/mw99v8d4/).

    function findColumnLength(file, callback) {
        // 1 KB at a time, because we expect that the column will probably be small.
        var CHUNK_SIZE = 1024;
        var offset = 0;
        var fr = new FileReader();
        fr.onload = function() {
            var view = new Uint8Array(fr.result);
            for (var i = 0; i < view.length; ++i) {
                if (view[i] === 10 || view[i] === 13) {
                    // \n = 10 and \r = 13
                    // column length = offset + position of \r or \n
                    callback(offset + i);
                    return;
                }
            }
            // \r or \n not found, continue seeking.
            offset += CHUNK_SIZE;
            seek();
        };
        fr.onerror = function() {
            // Cannot read file... Do something, e.g. assume column size = 0.
            callback(0);
        };
        seek();

        function seek() {
            if (offset >= file.size) {
                // No \r or \n found. The column size is equal to the full
                // file size
                callback(file.size);
                return;
            }
            var slice = file.slice(offset, offset + CHUNK_SIZE);
            fr.readAsArrayBuffer(slice);
        }
    }

The previous snippet counts the number of bytes before a line break. Counting the number of characters in a text consisting of multibyte characters is slightly more difficult, because you have to account for the possibility that the last byte in the chunk could be a part of a multibyte character.
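One possible way to handle that case, assuming the text is UTF-8, is to decode each chunk with a TextDecoder in streaming mode, which holds back an incomplete trailing byte sequence until the next chunk arrives. The sketch below is not part of the original answer: the helper name `countCharsBeforeLineBreak` is hypothetical, and it takes the chunks as an array of Uint8Array values so that it can run without a FileReader. In the browser, each chunk would be `new Uint8Array(fr.result)` from the onload handler.

```javascript
// Hypothetical helper: count the characters (code points) that come before
// the first \r or \n, given the file's bytes as a sequence of chunks.
// Assumes the input is UTF-8.
function countCharsBeforeLineBreak(chunks) {
  var decoder = new TextDecoder('utf-8');
  var count = 0;
  for (var chunk of chunks) {
    // stream: true keeps an incomplete multibyte sequence at the end of
    // this chunk buffered inside the decoder instead of emitting U+FFFD.
    var text = decoder.decode(chunk, { stream: true });
    for (var ch of text) { // for...of iterates by code point
      if (ch === '\n' || ch === '\r') {
        return count;
      }
      count++;
    }
  }
  // No line break found: flush the decoder and count what is left.
  // Any dangling bytes come out as a replacement character.
  for (var rest of decoder.decode()) {
    count++;
  }
  return count;
}

// "é" takes two bytes in UTF-8; split the input between those two bytes.
var bytes = new TextEncoder().encode('héllo\nworld');
var columnChars = countCharsBeforeLineBreak([
  bytes.subarray(0, 2), // "h" plus the first byte of "é"
  bytes.subarray(2)     // second byte of "é", then "llo\nworld"
]);
```

Here the line break sits at byte offset 6, while `columnChars` reflects the five characters of "héllo", which is the difference the paragraph above describes.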
