Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node compatibility issue with handling multipart/form-data files #20284

Closed
firecakes opened this issue Aug 25, 2023 · 13 comments · Fixed by #23244
Closed

Node compatibility issue with handling multipart/form-data files #20284

firecakes opened this issue Aug 25, 2023 · 13 comments · Fixed by #23244
Assignees
Labels
bug Something isn't working correctly needs info needs further information to be properly triaged node API Related to various "node:*" modules APIs node compat

Comments

@firecakes
Copy link

firecakes commented Aug 25, 2023

I've been noticing file corruption issues while letting the koa-body library handle multiple files with multipart/form-data requests. In the Deno server code where I'm using the npm modules some files get corrupted during the save while with NodeJS the files are saved fine. I'm using Deno v1.36.3, and the issue can reliably happen as long as you're selecting 20+ images on the browser as a multi-file upload to the server.

Client code that sends the request:

const formData = new FormData();
for (let i = 0; i < event.target.files.length; i++) {
  const file = event.target.files[i];
  formData.append('myFile', file);
}

const results = await axios.post('/api/file', formData, {
  headers: {
    'Content-Type': 'multipart/form-data'
  }
});

Server code in Deno:

import Koa from "npm:[email protected]";
import Router from "npm:@koa/[email protected]";
import serve from "npm:[email protected]";
import { koaBody } from "npm:[email protected]";
import { createReadStream } from "node:fs";
import http from "node:http";


// start the web server initialization
const app = new Koa();
const router = new Router();

app.use(koaBody({
  parsedMethods: ["POST", "PUT", "PATCH", "DELETE"], // add DELETE as parsed method
  multipart: true, // parse multipart form data
  formidable: { // modify where the form data gets saved
    uploadDir: "static/tmp",
    keepExtensions: true,
    maxFileSize: 10 * 1024 * 1024 * 1024,
  },
}));

// add a new file
router.post("/api/file", async (ctx, next) => {
  if (!Array.isArray(ctx.request.files.myFile)) {
    ctx.request.files.myFile = [ctx.request.files.myFile];
  }

  // rename the files to what they were originally and move them to the correct location
  for await (let file of ctx.request.files.myFile) {
    await Deno.rename(
      `static/tmp/${file.newFilename}`,
      `static/files/${file.originalFilename}`,
    );
  }


  ctx.body = {
    files: ctx.request.files.myFile.map((file) =>
      `files/${file.originalFilename}`
    ),
  };
});

// get files under static folder, and resolve "/" to index.html
app.use(serve("static"));

app.use(router.routes());
app.use(router.allowedMethods());

http.createServer(app.callback()).listen(8000);

Server code in NodeJS

import Koa from "koa";
import Router from "@koa/router";
import serve from "koa-static";
import { koaBody } from "koa-body";
import { createReadStream } from "fs";
import http from "http";
import { readdir, rename } from 'node:fs/promises';

// start the web server initialization
const app = new Koa();
const router = new Router();

app.use(koaBody({
  parsedMethods: ["POST", "PUT", "PATCH", "DELETE"], // add DELETE as parsed method
  multipart: true, // parse multipart form data
  formidable: { // modify where the form data gets saved
    uploadDir: "static/tmp",
    keepExtensions: true,
    maxFileSize: 10 * 1024 * 1024 * 1024,
  },
}));

// add a new file
router.post("/api/file", async (ctx, next) => {
  if (!Array.isArray(ctx.request.files.myFile)) {
    ctx.request.files.myFile = [ctx.request.files.myFile];
  }

  // rename the files to what they were originally and move them to the correct location
  for await (let file of ctx.request.files.myFile) {
    await rename(
      `static/tmp/${file.newFilename}`,
      `static/files/${file.originalFilename}`,
    );
  }


  ctx.body = {
    files: ctx.request.files.myFile.map((file) =>
      `files/${file.originalFilename}`
    ),
  };
});

// get files under static folder, and resolve "/" to index.html
app.use(serve("static"));

app.use(router.routes());
app.use(router.allowedMethods());

http.createServer(app.callback()).listen(8000);
@bartlomieju bartlomieju added bug Something isn't working correctly node compat labels Aug 27, 2023
@firecakes
Copy link
Author

I can be more specific. I was testing uploading PNG files and then trying to read them back in the browser, and this is how I was noticing the data corruption issues, where the data in some of the files gets jumbled up and is unrecognizable as an image file.

The chance of each file being corrupted also appears to be correlated to its file size. For example, a bunch (15+) of images at 305B each had no corruptions. A bunch of 3.5KB images had only one corruption, and a bunch of large 2.1MB images had every image except for one corrupted.

@CoenraadS
Copy link

CoenraadS commented Aug 31, 2023

Updating file size to 5MB did the trick

Reproduction (Windows 10):

Node v20.5.1

npm ci
node .\main.js

Note all hashes match


Deno v1.36.4 (release, x86_64-pc-windows-msvc)

deno run --allow-all ./main.js

Note second hash mismatch


repro.zip

Removing fs.writev helps induce the bug after only 2 iterations, (See main.js@50

This is my analysis so far

  1. In node_modules\.deno\[email protected]\node_modules\formidable\src\VolatileFile.js, put a conditional breakpoint on the this._writeStream.write@line 64 => this.newFilename.endsWith("1.bin") (This will cause breakpoint when second file starts to write)
  2. Start debugger, wait until previous breakpoint is hit
  3. Step down through the code until end up at _stream.mjs@3930 (state.ended || state.reading || state.destroyed || state.errored || !state.constructed)
  4. Note that it will write to state.buffered, because !state.constructed is false
  5. Run until breakpoint, now we are at second write iteration of the second file
  6. Note that it will write to state.buffered, because !state.constructed is false
  7. Note that buffer still has the previous data from iteration 1 in it also
  8. Run until breakpoint, now we are at third write iteration of the second file
  9. Note now the state.constructed is true, we will bypass the buffer
  10. Note that buffer still has the previous data from iteration 1 & 2 in it
    (An easier way to check this is conditional breakpoint on _stream.mjs@3944 => break when state.buffered.length > 0
  11. Data corrupted, buffer was not cleared before this third write (I checked in hex editor, bytes of the first write are from the third iteration)

image

cc @bartlomieju This is a reproducible data corruption issue

@bartlomieju
Copy link
Member

Thanks for the report folks. We fixed a big bug in Socket class in v1.36.4, is it still reproducible in the latest version?

@CoenraadS
Copy link

@bartlomieju Yes, also occurs with deno 1.36.4 (release, x86_64-pc-windows-msvc)

@bartlomieju
Copy link
Member

Thanks, we'll look into that!

@firecakes
Copy link
Author

Any updates on this @bartlomieju? Also tried this with v1.37.2.

@firecakes
Copy link
Author

Update: I haven't had issues with this since running newer versions of Deno (v1.38.3+). Seems like it got resolved, just wanted to make sure it was intentional before closing the issue.

@CoenraadS
Copy link

CoenraadS commented Jan 27, 2024

I tested my repro from #20284 (comment) with 1.40.2 and it still occurs.

However the 'happy path' when fs.writev is available seems to work ok.

.e.g.

Ok:
const fileStream = fs.createWriteStream(vf.filepath);

Potential corruption:
const fileStream = fs.createWriteStream(vf.filepath, { fs: { open: fs.open, write: fs.write, close: fs.close } }); // Removed fs.writev

@bartlomieju bartlomieju added the node API Related to various "node:*" modules APIs label Mar 4, 2024
@kt3k
Copy link
Member

kt3k commented Mar 27, 2024

@CoenraadS I couldn't reproduce the data corruption using the repro.zip (main.js) with the latest Deno on Windows Server 2022 and MacOS 13.3. Can you still reproduce the error with the latest Deno?

Screenshot 2024-03-27 at 22 36 48

If you still can, how likely does it happen on your end?

@kt3k kt3k added the needs info needs further information to be properly triaged label Mar 28, 2024
@dy-dx
Copy link

dy-dx commented Mar 29, 2024

A createWriteStream issue has just been fixed in node.js, but it still exists in deno.
It might be related to this issue.

Repro:

import fs from 'node:fs';
import process from 'node:process';

const w = fs.createWriteStream('file.txt');

w.on('open', () => {
  w.write('hello');

  process.nextTick(() => {
    w.write('world');
    w.end();
  });
});

w.on('close', () => {
  console.log(fs.readFileSync('file.txt', 'utf8'));
});

deno v1.42.0:

$ deno run --allow-all index.mjs

worldhello

@CoenraadS
Copy link

CoenraadS commented Mar 30, 2024

@kt3k

It still occurs for me, sending 2 files I think isn't enough, try bumping it up to a higher number. Usually within the first 10 I get a hash error.

image

I will reiterate though that the default happy path works ok, only when removing fs.writev does it occur (while node passes), which I only did while experimenting with creating a repro, not using this in real code.

Ok:
const fileStream = fs.createWriteStream(vf.filepath);

Potential corruption:
const fileStream = fs.createWriteStream(vf.filepath, { fs: { open: fs.open, write: fs.write, close: fs.close } }); // Removed fs.writev

@kt3k
Copy link
Member

kt3k commented Apr 3, 2024

@CoenraadS Thanks. Now I'm able to reproduce the issue consistently with ~20 iterative checks on macOS.

The issue (with fs.writev excluded) seems to have started from Deno 1.39.3. It doesn't happen with 1.39.2 or lower. Update: I used 50 sendFiles for checking it, but now I found the check has high chance of false positives, and 1.39.2 and lower version still seem having the issue.

I'll check it in more details soon

@kt3k
Copy link
Member

kt3k commented Apr 5, 2024

@dy-dx Thanks for the input! Looks related to this issue. I tried the patch nodejs/node#52005 together with nodejs/node#46818, but it didn't fix the given example (it still shows worldhello).. Do you think of any possible cause of it? (Maybe our Node stream polyfills are out-dated somehwere..) Nevermind (I was modifying unrelated file.. 🤦 ). Appliying nodejs/node#52005 seems fixing the given example, and also this issue.

kt3k added a commit that referenced this issue Apr 8, 2024
This PR follows this fix (nodejs/node#52005) in
Node.js.

Stream's construct callback happens one tick earlier by this change, and
it prevents the reordering of the first few chunks in
`node:stream.Writable`

closes #20284
satyarohith pushed a commit that referenced this issue Apr 11, 2024
This PR follows this fix (nodejs/node#52005) in
Node.js.

Stream's construct callback happens one tick earlier by this change, and
it prevents the reordering of the first few chunks in
`node:stream.Writable`

closes #20284
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working correctly needs info needs further information to be properly triaged node API Related to various "node:*" modules APIs node compat
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants