Generating a File Stream from a String in Node.js

程序员文章站 2022-04-09 12:40:29

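1. Background

In scenarios such as file-related data processing, you often have to decide how to handle the physical files that get generated. For example:

Where should the generated file go, and does the path exist?

When should temporary files be cleaned up, and how do you resolve naming conflicts and prevent overwrites?

How do you guarantee read and write ordering under concurrency?

...

The best answer to the problems that come with reading and writing physical files is not to write files at all. In some scenarios, however, avoiding the file is not that easy. File upload is one of them.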

2. Problem

File upload is usually implemented by submitting a form, for example:

var FormData = require('form-data');
var fs = require('fs');

var form = new FormData();
form.append('my_file', fs.createReadStream('/foo/bar.jpg'));
form.submit('http://example.org/upload', function(err, res) {
  console.log(res.statusCode);
});

(Excerpted from form-data)

If you would rather not write a physical file, you can do this instead:

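const FormData = require('form-data');

const filename = 'my-file.txt';
const content = 'balalalalala...变身';

const formData = new FormData();
// 1. Convert the string to a Buffer first
const fileContent = Buffer.from(content);
// 2. Supply the file meta information
formData.append('file', fileContent, {
  filename,
  contentType: 'text/plain',
  // Length in bytes, not in characters
  knownLength: fileContent.byteLength
});

In other words, besides providing data, a file stream also carries some meta information, such as the file name and file path, which an ordinary stream does not have. So is there a way to create a "real" file stream out of thin air?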
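The difference can be observed directly with the standard library. The following is a minimal sketch: it only checks for the `path` property, and it uses `process.execPath` purely as a file that is certain to exist (an assumption made for the demo, not something from the original example).

```javascript
const fs = require('fs');
const { Readable } = require('stream');

// A stream created by fs.createReadStream remembers the path it was given.
// process.execPath (the node binary) is used only because it is a file that
// is sure to exist; nothing is actually read from it.
const fileStream = fs.createReadStream(process.execPath);
const hasPathOnFileStream = typeof fileStream.path === 'string';
fileStream.destroy(); // release the stream right away

// A generic Readable carries data only; it has no notion of a path.
const plainStream = new Readable({ read() {} });
const hasPathOnPlainStream = 'path' in plainStream;

console.log(hasPathOnFileStream);  // true
console.log(hasPathOnPlainStream); // false
```

Libraries that accept a stream can use this kind of property to infer a file name, which is why a plain stream needs the meta information supplied separately.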

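3. Approach

There are at least two opposite ways to create a "real" file stream:

Add file-related meta information to an ordinary stream

Obtain a genuine file stream first, then replace its data and meta information

The former is clearly more flexible, and it can be implemented without depending on a file at all.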
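As a rough illustration of the first approach, an ordinary `Readable` can be given a `path` property by hand. `stringToNamedStream` is a hypothetical helper written for this sketch; it is not part of any library mentioned in this article.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: wraps a string in a Readable and attaches a `path`
// property so that it resembles the stream returned by fs.createReadStream.
function stringToNamedStream(content, path) {
  const buffer = Buffer.from(content);
  let sent = false;
  const stream = new Readable({
    read() {
      // Emit the whole buffer once, then signal end-of-stream.
      if (!sent) {
        sent = true;
        this.push(buffer);
      } else {
        this.push(null);
      }
    }
  });
  stream.path = path; // the only "file" metadata added in this sketch
  return stream;
}

const stream = stringToNamedStream('hello', './abc.txt');
console.log(stream.path); // prints ./abc.txt
```

A consumer that inspects more than `path` (for instance the 'open' event or a file descriptor) could still tell this apart from a genuine file stream, which is one reason to look at how the real thing is produced.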

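How a file stream is produced

Following the create-from-nothing approach and digging into the implementation of the fs.createReadStream API shows that the key steps in producing a file stream are as follows (abridged from the Node.js source; pool, thisPool, start and toRead are set up by code omitted here):

function ReadStream(path, options) {
  // 1. Open the file specified by path
  if (typeof this.fd !== 'number')
    this.open();
}

ReadStream.prototype.open = function() {
  fs.open(this.path, this.flags, this.mode, (er, fd) => {
    // 2. Obtain the file descriptor and hold on to it
    this.fd = fd;
    this.emit('open', fd);
    this.emit('ready');
    // 3. Start reading the data as a stream
    // read comes from the parent class Readable and mainly calls the internal method _read
    // ref: https://github.com/nodejs/node/blob/v10.16.3/lib/_stream_readable.js#L390
    this.read();
  });
};

ReadStream.prototype._read = function(n) {
  // 4. Read one chunk from the file
  fs.read(this.fd, pool, pool.used, toRead, this.pos, (er, bytesRead) => {
    let b = null;
    if (bytesRead > 0) {
      this.bytesRead += bytesRead;
      b = thisPool.slice(start, start + bytesRead);
    }
    // 5. Emit one chunk (by triggering the 'data' event); if there is more data,
    // this.read is called again via process.nextTick, until this.push(null) triggers the 'end' event
    // ref: https://github.com/nodejs/node/blob/v10.16.3/lib/_stream_readable.js#L207
    this.push(b);
  });
};

P.S. Step 5 is the most involved one: this.push(buffer) can trigger the read of the next chunk (this.read()), and once the data is exhausted, this.push(null) triggers the 'end' event. See node/lib/_stream_readable.js for details.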
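The push mechanism in step 5 can be tried in isolation with a small custom `Readable`. This is a minimal sketch; `ListStream` is a name invented for the illustration and has nothing to do with the Node.js internals quoted above.

```javascript
const { Readable } = require('stream');

// A Readable that serves a fixed list of chunks, one per _read call.
class ListStream extends Readable {
  constructor(chunks) {
    super();
    this.chunks = chunks.slice();
  }

  _read() {
    if (this.chunks.length > 0) {
      // Handing over a chunk leads to a 'data' event and, while the consumer
      // keeps up, to another _read call for the next chunk.
      this.push(Buffer.from(this.chunks.shift()));
    } else {
      // push(null) marks the end of the data; 'end' fires once it is consumed.
      this.push(null);
    }
  }
}

const events = [];
const done = new Promise((resolve) => {
  new ListStream(['a', 'b', 'c'])
    .on('data', (chunk) => events.push(`data:${chunk}`))
    .on('end', () => {
      events.push('end');
      resolve(events);
    });
});

done.then((log) => console.log(log.join(' '))); // data:a data:b data:c end
```

The stream never calls `read()` itself here: attaching a 'data' listener switches it into flowing mode, and the base class drives `_read` from then on.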

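Reimplementing the file stream

Now that the production process of a file stream is clear, the natural next step is to replace every file operation until the file stream implementation no longer depends on a file at all. For example:

// Read one chunk from the file
fs.read(this.fd, pool, pool.used, toRead, this.pos, (er, bytesRead) => {
  /* ... */
});

// becomes
this._fakeReadFile(this.fd, pool, pool.used, toRead, this.pos, (bytesRead) => {
  /* ... */
});

// Copy one chunk out of the Buffer that corresponds to the input string
ReadStream.prototype._fakeReadFile = function(_, buffer, offset, length, position, cb) {
  position = position || this.input._position;
  // fake read file async
  setTimeout(() => {
    let bytesRead = 0;
    if (position < this.input.byteLength) {
      // sourceEnd is exclusive in Buffer#copy, so position + length covers exactly `length` bytes
      bytesRead = this.input.copy(buffer, offset, position, position + length);
      this.input._position += bytesRead;
    }
    cb(bytesRead);
  }, 0);
};

That is, strip out the file operations and substitute string-based operations for them.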
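Put together, the idea might look like the following self-contained sketch. It is an illustration built on the public `Readable` API, not the actual source of any module; the class name `StringFileStream` and its `path` and `highWaterMark` options are assumptions made for the example.

```javascript
const { Readable } = require('stream');

// Illustrative class: a Readable backed by a string that exposes
// file-stream-like fields (path, bytesRead) and events ('open', 'ready').
class StringFileStream extends Readable {
  constructor(content, options = {}) {
    super({ highWaterMark: options.highWaterMark });
    this.input = Buffer.from(content);
    this.position = 0;        // next byte of `input` to hand out
    this.bytesRead = 0;       // mirrors the counter kept by fs.ReadStream
    this.path = options.path; // metadata a real file stream would carry
    // There is nothing to open, so announce readiness on the next tick,
    // after the caller has had a chance to attach listeners.
    // (A real file stream passes a file descriptor to 'open'; none exists here.)
    process.nextTick(() => {
      this.emit('open');
      this.emit('ready');
    });
  }

  _read(size) {
    // Fake an asynchronous read: copy up to `size` bytes out of the buffer.
    setImmediate(() => {
      const remaining = this.input.length - this.position;
      if (remaining === 0) {
        this.push(null); // no data left: end the stream
        return;
      }
      const chunk = Buffer.alloc(Math.min(size, remaining));
      // Buffer#copy treats sourceEnd as exclusive.
      const copied = this.input.copy(chunk, 0, this.position, this.position + chunk.length);
      this.position += copied;
      this.bytesRead += copied;
      this.push(chunk);
    });
  }
}

// Usage: a small highWaterMark forces several chunks.
const stream = new StringFileStream('oh, my great data!', {
  path: './abc.txt',
  highWaterMark: 4
});
const chunks = [];
const result = new Promise((resolve, reject) => {
  stream.on('data', (chunk) => chunks.push(chunk));
  stream.on('end', () => resolve(Buffer.concat(chunks).toString()));
  stream.on('error', reject);
});
result.then((text) => console.log(stream.path, stream.bytesRead, text));
```

Building on `Readable` rather than copying the internal `ReadStream` code keeps the sketch short, at the cost of mimicking a file stream less closely than a reimplementation of `fs.ReadStream` would.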

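4. Solution

And so string-to-file-stream came about, a module for creating file streams out of thin air. Conceptually:

string2fileStream('string-content') === fs.createReadStream(/* path to a text file with content 'string-content' */)

For example: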

const assert = require('assert');
const string2fileStream = require('string-to-file-stream');

const input = 'oh, my great data!';
const s = string2fileStream(input);
s.on('data', (chunk) => {
  // A short input like this one arrives as a single chunk
  assert.strictEqual(chunk.toString(), input);
});
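Comparing inside the 'data' handler works when the whole string arrives as one chunk. For longer inputs, or when a multi-byte character straddles a chunk boundary, it is safer to gather all chunks and decode once at the end. The sketch below shows this with a stand-in stream from the standard library; `chunkedStream` and `readAll` are hypothetical helpers written for the illustration.

```javascript
const { Readable } = require('stream');

// Stand-in for any byte stream: serves `text` in slices of `chunkSize` bytes.
function chunkedStream(text, chunkSize) {
  const buffer = Buffer.from(text);
  let offset = 0;
  return new Readable({
    read() {
      if (offset >= buffer.length) {
        this.push(null);
        return;
      }
      this.push(buffer.subarray(offset, offset + chunkSize));
      offset += chunkSize;
    }
  });
}

// Gather every chunk and decode once at the end, so that a multi-byte
// character split across two chunks is still decoded correctly.
function readAll(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('data', (chunk) => chunks.push(chunk));
    stream.on('end', () => resolve(Buffer.concat(chunks).toString()));
    stream.on('error', reject);
  });
}

const input = 'balalalalala...变身';
readAll(chunkedStream(input, 4)).then((text) => {
  console.log(text === input); // true
});
```

With 4-byte slices the first Chinese character is cut in the middle, so calling `toString()` on each chunk separately would produce replacement characters; concatenating the Buffers first avoids that.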

The generated stream can likewise carry file meta information:

const FormData = require('form-data');
const string2fileStream = require('string-to-file-stream');

const formData = new FormData();
formData.append('file', string2fileStream('my-string-data', { path: './abc.txt' }));
formData.submit('http://example.org/upload', function(err, res) {
  console.log(res.statusCode);
});

Close enough to pass for the real thing.

References

fs.createReadStream(path[, options])
