storage: support resumable uploads #299
Conversation
Force-pushed from e5a72fe to ba2e8cf.
Force-pushed from a665191 to f76c406.
Overall this looks good. No big issues. RETRY_LIMIT might want to be increased if we put in exponential backoff as suggested; 5 seems like a saner default (as suggested here).
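A minimal sketch of the suggestion above: a retry limit of 5 combined with exponential backoff. The names (`RETRY_LIMIT`, `backoffDelay`, `withRetries`) and the exact backoff formula are illustrative, not the values the library merged.

```javascript
// Illustrative retry limit, per the review suggestion.
var RETRY_LIMIT = 5;

// Delay (in ms) before the nth retry: 2^n seconds plus up to 1s of jitter.
function backoffDelay(attempt) {
  return Math.pow(2, attempt) * 1000 + Math.floor(Math.random() * 1000);
}

// Run `operation(callback)`, retrying up to RETRY_LIMIT times on error,
// waiting longer before each successive attempt.
function withRetries(operation, callback) {
  var attempt = 0;
  function run() {
    operation(function (err, result) {
      if (err && attempt < RETRY_LIMIT) {
        attempt++;
        setTimeout(run, backoffDelay(attempt));
        return;
      }
      callback(err, result);
    });
  }
  run();
}
```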
Force-pushed from 76ed7ac to 7bd17d9.
Recent best practices have emerged, so I figured we should put these in place before merging. I've added a task list in the initial post with the intended revisions so far; more will likely be coming. One of the revisions is allowing a user to specify a preference for a simple or resumable upload. I'm seeking opinions on converting the API.

Current:

```js
myBucket.upload("./photo.jpg", myFile, { my: "metadata" }, function (err, file) {})
```

Suggested:

```js
myBucket.upload("./photo.jpg", {
  destination: myFile,
  metadata: {
    my: "metadata"
  },
  resumable: (true || false)
}, function (err, file) {})
```

Current:

```js
myFile.createWriteStream({ my: "metadata" })
```

Suggested:

```js
myFile.createWriteStream({
  resumable: false, // default: true
  metadata: {
    my: "metadata"
  }
})
```

Any better ideas?
Looks good, but I'd like the user to be able to change the `resumable_threshold` (which defaults to 5 MB). Could we expose a configuration for storage, or add setters for similar values? In the future we might need it for the chunk size as well, and we could use it to let the user change the default for `createWriteStream` at a global level.
Config on the storage object:

```js
var gcloud = require("gcloud")({ /* conn info */ })
gcloud.storage({ resumableThreshold: n })
```

&

```js
var gcloud = require("gcloud")
var storage = gcloud.storage({ /* conn info */, resumableThreshold: n })
```
Bytes. The header is in bytes, so this seems like a simple choice.

Wait, why not?
In a stream, we can't stat a file for its size. It comes to us in small kB chunks, meaning we don't know if it's over a threshold until after we've already formed the request. I suppose if we wanted to, we could buffer up to the threshold into memory before beginning the request (which is the point at which we have to choose resumable vs. simple), but that seems like a dangerous approach.

& +1 on bytes.
Fair enough. Plus, you don't really know that the readable stream is a file at all. That being said, should resumable even work with streams unless they explicitly give us the filename to use?
That's a great question, but I think it's impossible to answer. Still, I anticipate resumable will be a desirable default, and technically speaking, we have a solution for the case where we resume an upload but are sent different data than we were originally (we bail and start a new upload). In any case, the user knows best what they are doing, so we will allow them to be explicit about which type of upload to use at upload time.
Can we get access to the readable stream that is piping their data to our writable stream? In theory, if we can, we could try to yank the
With a stream, we should only be aware of the data coming in, not how/where it originates. It would also be a bit magical if we tried to implement something like that. Usually, whenever there's magic, the solution is to add an option or a variation of the method that gives the user explicit control of the outcome. We will have both of those things (`{ resumable: false }` and `bucket.upload`).
Yeah, that would be too much magic, agreed. Getting back to the original question, I think it's safe to say that if the developer is uploading using a stream, they know that
Force-pushed from 7bd17d9 to 7e1dde8.
`upload` and `createWriteStream` choose between the simple/resumable upload techniques:

- `upload`: stat the incoming file for size; default to simple for < 5 MB, resumable for > 5 MB
- `createWriteStream`: default to resumable uploads

Fixes #298
`createWriteStream` uses the Resumable Upload API: http://goo.gl/jb0e9D. The process involves these steps:
If the initial upload operation is interrupted, the next time the user uploads the file, these steps occur:
If the user tries to upload entirely different data to the remote file:
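One piece of the interrupted-upload flow can be sketched concretely: per the Resumable Upload API, a status check on an interrupted session returns HTTP 308 with a `Range` header (`bytes=0-N`) describing the bytes the server has persisted. A minimal, illustrative parser (the function name is my own, not the library's):

```javascript
// Given the `Range` header from a 308 response, compute the offset of the
// next byte the client should send.
function resumeOffset(rangeHeader) {
  // No Range header means the server has persisted no bytes yet.
  if (!rangeHeader) {
    return 0;
  }
  // "bytes=0-12345" → bytes 0 through 12345 are stored; resume at 12346.
  var match = /bytes=0-(\d+)/.exec(rangeHeader);
  return match ? parseInt(match[1], 10) + 1 : 0;
}
```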