Reland of Improve linearized pdf load/show time.

Use the received byte count, not the chunk count, as the progress value.
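
As a rough illustration only (this is not code from this CL), a byte-based
progress value could be derived as in the sketch below. ProgressFromBytes is a
hypothetical helper name; the two arguments are assumed to be the kind of
values that URLLoaderWrapperImpl::GetDownloadProgress reports. Because chunks
can differ in size, a fraction based on bytes tracks the real download state
more closely than a fraction based on the number of chunks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of this CL: maps a received byte count to a
// progress fraction in the range [0, 1].
double ProgressFromBytes(int64_t bytes_received, int64_t total_bytes) {
  // A negative received count would indicate a caller bug.
  assert(bytes_received >= 0);
  // The total size may be unknown (reported as -1), in which case no
  // meaningful fraction exists.
  if (total_bytes <= 0)
    return 0.0;
  // Clamp in case more bytes arrive than were announced.
  if (bytes_received >= total_bytes)
    return 1.0;
  return static_cast<double>(bytes_received) /
         static_cast<double>(total_bytes);
}
```

For example, 50 received bytes out of 100 gives 0.5 regardless of how many
chunks those bytes arrived in.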

Reason for reland:
A fix for the issue that caused the revert has been added.

Bug: 755061

Original change's description:
> Revert "Reland of Improve linearized pdf load/show time."
>
> This reverts commit 9a6d1487991b9824b499aff0ce1c9b68834a14cd.
>
> Reason for revert: https://crbug.com/755061. This appears to be
> causing a serious regression in loading. The loading bar does not
> appear as expected and appears all at once at the end as the screen
> displays. There is 5-6 seconds of nothing happening which makes it look
> like the browser has stopped working.
>
>
> Original change's description:
> > Reland of Improve linearized pdf load/show time.
> >
> > XFA form loading has been fixed.
> > Now, for a document with a single non-XFA page, the form is loaded first.
> > This is necessary for loading pages correctly, because in an XFA document
> > the page count and page contents may change after the form is loaded.
> >
> > See
> > https://codereview.chromium.org/2558573002/
> >
> > To test this:
> >  build Chromium PDF with XFA support
> >  and open any document from
> >  https://www.idrsolutions.com/jpdfforms/xfa-html5-example-conversions/
> >
> > Original CL:
> >  https://codereview.chromium.org/2455403002/
> >
> > Original description:
> >  Improve linearized pdf load/show time.
> >  Reduce Pdf Plugin's count of reconnects.
> >  Add tests for PDFPlugin DocumentLoader.
> >
> >  DocumentLoader was split into separate components, and missing tests were added for them.
> >
> >  The main ideas in this CL are:
> >
> >  1) Do not reset the browser-initiated connection at start (including the case when we can use range requests) if we request data near the current download position.
> >  2) Request as much data as we can on each request, and continue loading data using the current range request (like rewinding a tape).
> >  3) Isolate RangeRequest logic into DocumentLoader. Method OnPendingRequestComplete is called when we receive the requested data (from the main connection or a range connection), like playing a tape without rewinding.
> >  4) Cover this logic with tests.
> >
> >  Example URL:
> >  http://www.major-landrover.ru/upload/attachments/f/9/f96aab07dab04ae89c8a509ec1ef2b31.pdf
> >  Comparison of changes:
> >  https://drive.google.com/file/d/0BzWfMBOuik2QNGg0SG93Y3lpUlE/view?usp=sharing
> >
> > Change-Id: I97bb25d2e82bcb4ba2e060af8128f49b9c0680d9
> > Reviewed-on: https://chromium-review.googlesource.com/581292
> > Reviewed-by: Robert Sesek <[email protected]>
> > Reviewed-by: Dan Sinclair <[email protected]>
> > Commit-Queue: Art Snake <[email protected]>
> > Cr-Commit-Position: refs/heads/master@{#489755}
>
>

[email protected]

Change-Id: I78e10565f639c26faae29b3cf854419208af8665
Reviewed-on: https://chromium-review.googlesource.com/615302
Commit-Queue: Lei Zhang <[email protected]>
Reviewed-by: (000 09-08 - 09-18) dsinclair <[email protected]>
Cr-Commit-Position: refs/heads/master@{#501344}
diff --git a/pdf/url_loader_wrapper_impl.cc b/pdf/url_loader_wrapper_impl.cc
new file mode 100644
index 0000000..b7bc808
--- /dev/null
+++ b/pdf/url_loader_wrapper_impl.cc
@@ -0,0 +1,325 @@
+// Copyright 2016 The Chromium Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file.
+
+#include "pdf/url_loader_wrapper_impl.h"
+
+#include "base/logging.h"
+#include "base/memory/ptr_util.h"
+#include "base/strings/string_util.h"
+#include "base/strings/stringprintf.h"
+#include "net/http/http_util.h"
+#include "pdf/timer.h"
+#include "ppapi/c/pp_errors.h"
+#include "ppapi/cpp/logging.h"
+#include "ppapi/cpp/url_request_info.h"
+#include "ppapi/cpp/url_response_info.h"
+
+namespace chrome_pdf {
+
+namespace {
+// Read with a delay to avoid blocking the UI thread and to reduce CPU usage.
+const int kReadDelayMs = 2;
+
+pp::URLRequestInfo MakeRangeRequest(pp::Instance* plugin_instance,
+                                    const std::string& url,
+                                    const std::string& referrer_url,
+                                    uint32_t position,
+                                    uint32_t size) {
+  pp::URLRequestInfo request(plugin_instance);
+  request.SetURL(url);
+  request.SetMethod("GET");
+  request.SetFollowRedirects(false);
+  request.SetCustomReferrerURL(referrer_url);
+
+  // According to RFC 2616, a byte range specifies the positions of the first
+  // and last bytes of the requested range, inclusive. Therefore we subtract 1
+  // from position + size to get the index of the last byte that needs to be
+  // downloaded.
+  std::string str_header =
+      base::StringPrintf("Range: bytes=%u-%u", position, position + size - 1);
+  pp::Var header(str_header.c_str());
+  request.SetHeaders(header);
+
+  return request;
+}
+
+bool GetByteRangeFromStr(const std::string& content_range_str,
+                         int* start,
+                         int* end) {
+  std::string range = content_range_str;
+  if (!base::StartsWith(range, "bytes", base::CompareCase::INSENSITIVE_ASCII))
+    return false;
+
+  range = range.substr(strlen("bytes"));
+  std::string::size_type pos = range.find('-');
+  std::string range_end;
+  if (pos != std::string::npos)
+    range_end = range.substr(pos + 1);
+  base::TrimWhitespaceASCII(range, base::TRIM_LEADING, &range);
+  base::TrimWhitespaceASCII(range_end, base::TRIM_LEADING, &range_end);
+  *start = atoi(range.c_str());
+  *end = atoi(range_end.c_str());
+  return true;
+}
+
+// If the headers have a byte-range response, writes the start and end
+// positions and returns true if at least the start position was parsed.
+// The end position will be set to 0 if it was not found or parsed from the
+// response.
+// Returns false if not even a start position could be parsed.
+bool GetByteRangeFromHeaders(const std::string& headers, int* start, int* end) {
+  net::HttpUtil::HeadersIterator it(headers.begin(), headers.end(), "\n");
+  while (it.GetNext()) {
+    if (base::LowerCaseEqualsASCII(it.name(), "content-range")) {
+      if (GetByteRangeFromStr(it.values().c_str(), start, end))
+        return true;
+    }
+  }
+  return false;
+}
+
+bool IsDoubleEndLineAtEnd(const char* buffer, int size) {
+  if (size < 2)
+    return false;
+
+  if (buffer[size - 1] == '\n' && buffer[size - 2] == '\n')
+    return true;
+
+  if (size < 4)
+    return false;
+
+  return buffer[size - 1] == '\n' && buffer[size - 2] == '\r' &&
+         buffer[size - 3] == '\n' && buffer[size - 4] == '\r';
+}
+
+}  // namespace
+
+class URLLoaderWrapperImpl::ReadStarter : public Timer {
+ public:
+  explicit ReadStarter(URLLoaderWrapperImpl* owner)
+      : Timer(kReadDelayMs), owner_(owner) {}
+  ~ReadStarter() override {}
+
+  // Timer overrides:
+  void OnTimer() override { owner_->ReadResponseBodyImpl(); }
+
+ private:
+  URLLoaderWrapperImpl* owner_;
+};
+
+URLLoaderWrapperImpl::URLLoaderWrapperImpl(pp::Instance* plugin_instance,
+                                           const pp::URLLoader& url_loader)
+    : plugin_instance_(plugin_instance),
+      url_loader_(url_loader),
+      callback_factory_(this) {
+  SetHeadersFromLoader();
+}
+
+URLLoaderWrapperImpl::~URLLoaderWrapperImpl() {
+  Close();
+  // We should call callbacks to prevent memory leaks.
+  // The callbacks don't do anything, because the objects that created the
+  // callbacks have been destroyed.
+  if (!did_open_callback_.IsOptional())
+    did_open_callback_.RunAndClear(-1);
+  if (!did_read_callback_.IsOptional())
+    did_read_callback_.RunAndClear(-1);
+}
+
+int URLLoaderWrapperImpl::GetContentLength() const {
+  return content_length_;
+}
+
+bool URLLoaderWrapperImpl::IsAcceptRangesBytes() const {
+  return accept_ranges_bytes_;
+}
+
+bool URLLoaderWrapperImpl::IsContentEncoded() const {
+  return content_encoded_;
+}
+
+std::string URLLoaderWrapperImpl::GetContentType() const {
+  return content_type_;
+}
+std::string URLLoaderWrapperImpl::GetContentDisposition() const {
+  return content_disposition_;
+}
+
+int URLLoaderWrapperImpl::GetStatusCode() const {
+  return url_loader_.GetResponseInfo().GetStatusCode();
+}
+
+bool URLLoaderWrapperImpl::IsMultipart() const {
+  return is_multipart_;
+}
+
+bool URLLoaderWrapperImpl::GetByteRange(int* start, int* end) const {
+  DCHECK(start);
+  DCHECK(end);
+  *start = byte_range_.start();
+  *end = byte_range_.end();
+  return byte_range_.IsValid();
+}
+
+bool URLLoaderWrapperImpl::GetDownloadProgress(
+    int64_t* bytes_received,
+    int64_t* total_bytes_to_be_received) const {
+  return url_loader_.GetDownloadProgress(bytes_received,
+                                         total_bytes_to_be_received);
+}
+
+void URLLoaderWrapperImpl::Close() {
+  url_loader_.Close();
+  read_starter_.reset();
+}
+
+void URLLoaderWrapperImpl::OpenRange(const std::string& url,
+                                     const std::string& referrer_url,
+                                     uint32_t position,
+                                     uint32_t size,
+                                     const pp::CompletionCallback& cc) {
+  did_open_callback_ = cc;
+  pp::CompletionCallback callback =
+      callback_factory_.NewCallback(&URLLoaderWrapperImpl::DidOpen);
+  int rv = url_loader_.Open(
+      MakeRangeRequest(plugin_instance_, url, referrer_url, position, size),
+      callback);
+  if (rv != PP_OK_COMPLETIONPENDING)
+    callback.Run(rv);
+}
+
+void URLLoaderWrapperImpl::ReadResponseBody(char* buffer,
+                                            int buffer_size,
+                                            const pp::CompletionCallback& cc) {
+  did_read_callback_ = cc;
+  buffer_ = buffer;
+  buffer_size_ = buffer_size;
+  read_starter_ = base::MakeUnique<ReadStarter>(this);
+}
+
+void URLLoaderWrapperImpl::ReadResponseBodyImpl() {
+  read_starter_.reset();
+  pp::CompletionCallback callback =
+      callback_factory_.NewCallback(&URLLoaderWrapperImpl::DidRead);
+  int rv = url_loader_.ReadResponseBody(buffer_, buffer_size_, callback);
+  if (rv != PP_OK_COMPLETIONPENDING) {
+    callback.Run(rv);
+  }
+}
+
+void URLLoaderWrapperImpl::SetResponseHeaders(
+    const std::string& response_headers) {
+  response_headers_ = response_headers;
+  ParseHeaders();
+}
+
+void URLLoaderWrapperImpl::ParseHeaders() {
+  content_length_ = -1;
+  accept_ranges_bytes_ = false;
+  content_encoded_ = false;
+  content_type_.clear();
+  content_disposition_.clear();
+  multipart_boundary_.clear();
+  byte_range_ = gfx::Range::InvalidRange();
+  is_multipart_ = false;
+
+  if (response_headers_.empty())
+    return;
+
+  net::HttpUtil::HeadersIterator it(response_headers_.begin(),
+                                    response_headers_.end(), "\n");
+  while (it.GetNext()) {
+    if (base::LowerCaseEqualsASCII(it.name(), "content-length")) {
+      content_length_ = atoi(it.values().c_str());
+    } else if (base::LowerCaseEqualsASCII(it.name(), "accept-ranges")) {
+      accept_ranges_bytes_ = base::LowerCaseEqualsASCII(it.values(), "bytes");
+    } else if (base::LowerCaseEqualsASCII(it.name(), "content-encoding")) {
+      content_encoded_ = true;
+    } else if (base::LowerCaseEqualsASCII(it.name(), "content-type")) {
+      content_type_ = it.values();
+      size_t semi_colon_pos = content_type_.find(';');
+      if (semi_colon_pos != std::string::npos) {
+        content_type_ = content_type_.substr(0, semi_colon_pos);
+      }
+      base::TrimWhitespaceASCII(content_type_, base::TRIM_ALL, &content_type_);
+      // multipart boundary.
+      std::string type = base::ToLowerASCII(it.values());
+      if (base::StartsWith(type, "multipart/", base::CompareCase::SENSITIVE)) {
+        const char* boundary = strstr(type.c_str(), "boundary=");
+        DCHECK(boundary);
+        if (boundary) {
+          multipart_boundary_ = std::string(boundary + 9);
+          is_multipart_ = !multipart_boundary_.empty();
+        }
+      }
+    } else if (base::LowerCaseEqualsASCII(it.name(), "content-disposition")) {
+      content_disposition_ = it.values();
+    } else if (base::LowerCaseEqualsASCII(it.name(), "content-range")) {
+      int start = 0;
+      int end = 0;
+      if (GetByteRangeFromStr(it.values().c_str(), &start, &end)) {
+        byte_range_ = gfx::Range(start, end);
+      }
+    }
+  }
+}
+
+void URLLoaderWrapperImpl::DidOpen(int32_t result) {
+  SetHeadersFromLoader();
+  did_open_callback_.RunAndClear(result);
+}
+
+void URLLoaderWrapperImpl::DidRead(int32_t result) {
+  if (multi_part_processed_) {
+    // Reset this flag so we look inside the buffer in calls of DidRead for this
+    // response only once.  Note that this code DOES NOT handle multi part
+    // responses with more than one part (we don't issue them at the moment, so
+    // they shouldn't arrive).
+    is_multipart_ = false;
+  }
+  if (result <= 0 || !is_multipart_) {
+    did_read_callback_.RunAndClear(result);
+    return;
+  }
+  if (result <= 2) {
+    // TODO(art-snake): Accumulate data for parse headers.
+    did_read_callback_.RunAndClear(result);
+    return;
+  }
+
+  char* start = buffer_;
+  size_t length = result;
+  multi_part_processed_ = true;
+  for (int i = 2; i < result; ++i) {
+    if (IsDoubleEndLineAtEnd(buffer_, i)) {
+      int start_pos = 0;
+      int end_pos = 0;
+      if (GetByteRangeFromHeaders(std::string(buffer_, i), &start_pos,
+                                  &end_pos)) {
+        byte_range_ = gfx::Range(start_pos, end_pos);
+        start += i;
+        length -= i;
+      }
+      break;
+    }
+  }
+  result = length;
+  if (result == 0) {
+    // Continue receiving.
+    return ReadResponseBodyImpl();
+  }
+  DCHECK(result > 0);
+  memmove(buffer_, start, result);
+
+  did_read_callback_.RunAndClear(result);
+}
+
+void URLLoaderWrapperImpl::SetHeadersFromLoader() {
+  pp::URLResponseInfo response = url_loader_.GetResponseInfo();
+  pp::Var headers_var = response.GetHeaders();
+
+  SetResponseHeaders(headers_var.is_string() ? headers_var.AsString() : "");
+}
+
+}  // namespace chrome_pdf
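
The inclusive byte-range arithmetic used by MakeRangeRequest and the
Content-Range parsing done by GetByteRangeFromStr can be illustrated with a
stand-alone sketch that uses only the C++ standard library. BuildRangeHeader
and ParseContentRange are hypothetical names that do not appear in this CL,
and the sketch assumes a well-formed "bytes <start>-<end>/<total>" value. It
is a simplified illustration, not the implementation above; for instance, it
matches the "bytes" unit case-sensitively, whereas the CL compares it
case-insensitively.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a Range header for |size| bytes starting at
// |position|. The last-byte position is inclusive, hence the "- 1".
std::string BuildRangeHeader(uint32_t position, uint32_t size) {
  assert(size > 0);
  // Widen to 64 bits so that position + size cannot wrap around.
  const uint64_t last = static_cast<uint64_t>(position) + size - 1;
  return "Range: bytes=" + std::to_string(position) + "-" +
         std::to_string(last);
}

// Hypothetical helper: parses a Content-Range value of the form
// "bytes <start>-<end>/<total>". Returns false if either position is missing
// or if the end precedes the start.
bool ParseContentRange(const std::string& value,
                       uint64_t* start,
                       uint64_t* end) {
  unsigned long long parsed_start = 0;
  unsigned long long parsed_end = 0;
  if (std::sscanf(value.c_str(), "bytes %llu-%llu", &parsed_start,
                  &parsed_end) != 2) {
    return false;
  }
  if (parsed_end < parsed_start)
    return false;
  *start = parsed_start;
  *end = parsed_end;
  return true;
}
```

Requesting 100 bytes from position 0 yields "Range: bytes=0-99", and a server
reply of "bytes 0-99/1000" parses back to the same inclusive pair (0, 99).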