storage: support resumable uploads #299
Conversation
Force-pushed from e5a72fe to ba2e8cf.
Force-pushed from a665191 to f76c406.
Overall this looks good. No big issues. RETRY_LIMIT might want to be increased if we put in exponential backoff as suggested; 5 seems like a saner default (as suggested here).
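A minimal sketch of the suggestion above: a retry limit of 5 combined with exponential backoff. The names (`RETRY_LIMIT`, `backoffDelay`, `withRetries`) and the exact backoff formula are illustrative, not the values the library merged.

```javascript
// Illustrative retry limit, per the review suggestion.
var RETRY_LIMIT = 5;

// Delay (in ms) before the nth retry: 2^n seconds plus up to 1s of jitter.
function backoffDelay(attempt) {
  return Math.pow(2, attempt) * 1000 + Math.floor(Math.random() * 1000);
}

// Run `operation(callback)`, retrying up to RETRY_LIMIT times on error,
// waiting longer before each successive attempt.
function withRetries(operation, callback) {
  var attempt = 0;
  function run() {
    operation(function (err, result) {
      if (err && attempt < RETRY_LIMIT) {
        attempt++;
        setTimeout(run, backoffDelay(attempt));
        return;
      }
      callback(err, result);
    });
  }
  run();
}
```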
Force-pushed from 76ed7ac to 7bd17d9.
Recent best practices have emerged, so I figured we should put these in place before merging. I've added a task list in the initial post with the intended revisions so far; more will likely be coming. One of the revisions is allowing a user to specify a preference for a simple or resumable upload. I'm seeking opinions on converting the API.

Current:

```js
myBucket.upload("./photo.jpg", myFile, { my: "metadata" }, function (err, file) {})
```

Suggested:

```js
myBucket.upload("./photo.jpg", {
  destination: myFile,
  metadata: {
    my: "metadata"
  },
  resumable: (true || false)
}, function (err, file) {})
```

Current:

```js
myFile.createWriteStream({ my: "metadata" })
```

Suggested:

```js
myFile.createWriteStream({
  resumable: false, // default: true
  metadata: {
    my: "metadata"
  }
})
```

Any better ideas?
Looks good, but I'd like the user to be able to change the `resumable_threshold` (which defaults to 5 MB). Could we expose a configuration for storage, or add setters for similar values? In the future we might need it for the chunk size as well, and we could use it to let the user change the default for `createWriteStream` at a global level.
Config on the storage object:

```js
var gcloud = require("gcloud")({ /* conn info */ })
gcloud.storage({ resumableThreshold: n })
```

&

```js
var gcloud = require("gcloud")
var storage = gcloud.storage({ /* conn info */, resumableThreshold: n })
```
Bytes. The header is in bytes, so this seems like a simple choice.

Wait, why not?
In a stream, we can't stat a file for its size. It comes to us in small kB chunks, meaning we don't know if it's over a threshold until after we've already formed the request. I suppose if we wanted to, we could buffer up to the threshold into memory before beginning the request (which is the point at which we have to choose resumable vs. simple), but that seems like a dangerous approach.

& +1 on bytes.
Fair enough. Plus, you don't really know that the readable stream is a file at all. That being said, should resumable even work with streams unless they explicitly give us the filename to use?
That's a great question, but I think it's impossible to answer. Still, I anticipate resumable will be a desirable default, and technically speaking, we have a solution for the case where we resume an upload but are sent different data than we were originally (we bail and start a new upload). In any case, the user knows best what they are doing, so we will allow them to be explicit about which type of upload to use at upload time.
Can we get access to the readable stream that is piping their data to our writable stream? In theory, if we can, we could try to yank the
With a stream, we should only be aware of the data coming in, not how/where it originates. It would also be a bit magical if we tried to implement something like that. Usually, whenever there's magic, the solution is to add an option or a variation of the method that gives the user explicit control of the outcome. We will have both of those things (`{ resumable: false }` and `bucket.upload`).
Yeah, that would be too much magic, agreed. Getting back to the original question, I think it's safe to say that if the developer is uploading using a stream, they know that
Force-pushed from 7bd17d9 to 7e1dde8.
`upload` and `createWriteStream` choose between the simple/resumable upload techniques:

- `upload`: stat the incoming file for size; default to simple for < 5 MB, resumable for > 5 MB
- `createWriteStream`: default to resumable uploads

Fixes #298
`createWriteStream` uses the Resumable Upload API: http://goo.gl/jb0e9D. The process involves these steps:
If the initial upload operation is interrupted, the next time the user uploads the file, these steps occur:
If the user tries to upload entirely different data to the remote file:
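One piece of the interrupted-upload flow can be sketched concretely: per the Resumable Upload API, a status check on an interrupted session returns HTTP 308 with a `Range` header (`bytes=0-N`) describing the bytes the server has persisted. A minimal, illustrative parser (the function name is my own, not the library's):

```javascript
// Given the `Range` header from a 308 response, compute the offset of the
// next byte the client should send.
function resumeOffset(rangeHeader) {
  // No Range header means the server has persisted no bytes yet.
  if (!rangeHeader) {
    return 0;
  }
  // "bytes=0-12345" → bytes 0 through 12345 are stored; resume at 12346.
  var match = /bytes=0-(\d+)/.exec(rangeHeader);
  return match ? parseInt(match[1], 10) + 1 : 0;
}
```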