Skip to content
This repository was archived by the owner on Apr 10, 2025. It is now read-only.

PageSpeed output resources cannot be cached in Google Cloud #1149

Closed
JoyceBabu opened this issue Mar 11, 2016 · 18 comments
Closed

PageSpeed output resources cannot be cached in Google Cloud #1149

JoyceBabu opened this issue Mar 11, 2016 · 18 comments

Comments

@JoyceBabu
Copy link

Currently pagespeed adds Cache-Control:max-age=31536000 to optimized resources. Is it possible to add public to it?

Google CDN (alpha) requires 'Cache-Control: public' to be present for resources that must be cached.

@jmarantz
Copy link
Contributor

Hi -- I'm getting some clarification from some people that know more about
the CDN.

On Fri, Mar 11, 2016 at 6:35 AM, Joyce Babu [email protected]
wrote:

Currently pagespeed adds Cache-Control:max-age=31536000 to optimized
resources. Is it possible to add public to it?

Google CDN (alpha) requires 'Cache-Control: public' to be present for
resources that must be cached.


Reply to this email directly or view it on GitHub
#1149.

@jmarantz
Copy link
Contributor

Adding "public" to the cache-control value is a requirement for Google CDN, but is not required by the HTTP spec. https://cloud.google.com/compute/docs/load-balancing/http/cdn describes the Google requirements, which are pretty clear.

To meet these without adding an unnecessary qualifier for other caches, we can gate this logic on the presence of response-header
Via: 1.1 google

@jmarantz
Copy link
Contributor

Joyce: is it possible for you to add a workaround for this problem in your nginx config? That might help you get better behavior until we can add this functionality to a PageSpeed release.

@JoyceBabu
Copy link
Author

I tried disabling all rewriting rules, and using add_header 'Cache-Control' public;. This resulted in two Cache-Control headers

Cache-Control:max-age=604800
Cache-Control: public

Google CDN did not cache those resources, even though it is technically equivalent to Cache-Control:max-age=604800, public

To meet these without adding an unnecessary qualifier for other caches, we can gate this logic on the presence of response-header

Did you mean request-header?

Joyce: is it possible for you to add a workaround for this problem in your nginx config? That might help you get better behavior until we can add this functionality to a PageSpeed release.

I am unable to modify the Cache-Control headers from nginx config. I tried using POSITION_AUX=true, still I am unable to modify the headers. I remember seeing a comment by Jeff in another issue that nginx location blocks targeting optimized resources doesn't work when LoadFromFile is enabled.

@jmarantz
Copy link
Contributor

Yes I meant the Via request-header :)

there is not an nginx equivalent to apache's "header set" directive?

On Sat, Mar 12, 2016 at 3:37 AM, Joyce Babu [email protected]
wrote:

I tried disabling all rewriting rules, and using add_header
'Cache-Control' public;. This resulted in two Cache-Control headers

Cache-Control:max-age=604800
Cache-Control: public

Google CDN did not cache those resources, even though it is technically
equivalent to Cache-Control:max-age=604800, public

To meet these without adding an unnecessary qualifier for other caches, we
can gate this logic on the presence of response-header

Did you mean request-header?

Joyce: is it possible for you to add a workaround for this problem in your
nginx config? That might help you get better behavior until we can add this
functionality to a PageSpeed release.

I am unable to modify the Cache-Control headers from nginx config. I tried
using POSITION_AUX=true, still I am unable to modify the headers. I
remember seeing a comment by Jeff in another issue that nginx location
blocks don't when LoadFromFile is enabled.


Reply to this email directly or view it on GitHub
#1149 (comment)
.

@JoyceBabu
Copy link
Author

there is not an nginx equivalent to apache's "header set" directive?

Yes, there is the add_header directive and the more_set_headers directive. But from what I understand, the header filter appears before ngx_pagespeed in the nginx filter chain. Hence, headers are always intercepted and modified by ngx_pagespeed.

Setting the (experimental) environment variable POSITION_AUX=true is supposed to move ngx_pagespeed's position, so that it will be possible to modify the headers from nginx config. I just found out that it was not working because a change introduced for nginx 1.9.11+ had broken that option. After fixing it, now I am able to do more_set_headers "Cache-Control: max-age=31536000, public";

Thank you.

@jmarantz
Copy link
Contributor

Glad you found a workaround, but I want to leave this open till we fix it properly.

@jmarantz
Copy link
Contributor

This auto-closed when I subbmitted the partial fix: apache/incubator-pagespeed-mod@6d81e80

This fix propagates cache-control:public from origin resources to rewritten resources. However, it does not infer that we need cache-control:public when running inside GCE, which it can do with the "Via" request header, if that includes the substring 'google'.

jmarantz added a commit to apache/incubator-pagespeed-mod that referenced this issue Jun 27, 2016
via:*google*.

Completes the fix to
apache/incubator-pagespeed-ngx#1149

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile
also we must make sure we don't cache the 'public' based on the request
headers.
@dhirenpratap
Copy link

It gonna help me as ..as google cdn requires public header.

@jmarantz
Copy link
Contributor

You can get this today if you build in trunk, or you can wait for a release that includes this bug fix, but I don't have a date for you yet.

@dhirenpratap
Copy link

Can wait for it. Thanks for the hard work.

crowell pushed a commit to apache/incubator-pagespeed-mod that referenced this issue Jul 26, 2016
Partially fixes apache/incubator-pagespeed-ngx#1149

General strategy:
 - adjust general mechanism for computing output response header from input response headers
   to incorporate 'public' on input.

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile

Propagate 'cache-control:public' from inputs to outputs.

Partially fixes apache/incubator-pagespeed-ngx#1149

General strategy:
 - adjust general mechanism for computing output response header from input response headers
   to incorporate 'public' on input.

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile

fixup

fixup for c++03

more c++03isms

rm unused include
crowell pushed a commit to apache/incubator-pagespeed-mod that referenced this issue Jul 26, 2016
Partially fixes apache/incubator-pagespeed-ngx#1149

General strategy:
 - adjust general mechanism for computing output response header from input response headers
   to incorporate 'public' on input.

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile

Propagate 'cache-control:public' from inputs to outputs.

Partially fixes apache/incubator-pagespeed-ngx#1149

General strategy:
 - adjust general mechanism for computing output response header from input response headers
   to incorporate 'public' on input.

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile

fixup

fixup for c++03

more c++03isms

rm unused include

rm override (no c++11 on branch 33)

Also add 'public' to non-private cache-control if request has
via:*google*.

Completes the fix to
apache/incubator-pagespeed-ngx#1149

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile
also we must make sure we don't cache the 'public' based on the request
headers.
crowell pushed a commit to apache/incubator-pagespeed-mod that referenced this issue Jul 26, 2016
Partially fixes apache/incubator-pagespeed-ngx#1149

General strategy:
 - adjust general mechanism for computing output response header from input response headers
   to incorporate 'public' on input.

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile

Propagate 'cache-control:public' from inputs to outputs.

Partially fixes apache/incubator-pagespeed-ngx#1149

General strategy:
 - adjust general mechanism for computing output response header from input response headers
   to incorporate 'public' on input.

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile

fixup

fixup for c++03

more c++03isms

rm unused include

rm override (no c++11 on branch 33)

Also add 'public' to non-private cache-control if request has
via:*google*.

Completes the fix to
apache/incubator-pagespeed-ngx#1149

A challenge here is to make sure we test all the egress points, e.g.
  - pagespeed resources: cached and reconstructed on demand
  - fallbacks of various sorts
  - ipro resources: cached and reconstructed on demand
  - loaded from LoadFromFile
also we must make sure we don't cache the 'public' based on the request
headers.

Add a method to add 'public' to response-headers if via:*google is
present in request-headers.

c++03
@crowell crowell changed the title Set Cache-Control: public on pagespeed optimized resources Set Cache-Control: public on pagespeed optimized resources when running in Google cloud Aug 2, 2016
@justbeez
Copy link

justbeez commented Aug 5, 2016

Thanks for the effort on this @jmarantz @crowell!

We've been running into the same issues with Cloud CDN on the mod_pagespeed side and trying to come up with workaround based on where this hooks in our chain—and I'm glad to have found this issue over on the ngx_pagespeed tracker.

I see that about a month and a half ago that there wasn't a date for this to roll out yet—does anyone know if there's a window scheduled for an upcoming release? I haven't seen one in quite a while, so I'm mostly just trying to figure out if it's worth building trunk or if my laziness will pay off ;)

Thanks again all!

@crowell
Copy link
Contributor

crowell commented Aug 9, 2016

We're working on getting the release rolled out this week. We'll make an announcement on the mailing lists when it is available.

@crowell crowell changed the title Set Cache-Control: public on pagespeed optimized resources when running in Google cloud PageSpeed output resources cannot be cached in Google Cloud Aug 12, 2016
@olekstumedia
Copy link

Any update on this? We really need that new version of ngx_pagespeed module((
GoogleCDN will not work for the all servers that use pagespeed))) This is contradictorily assuming that pagespeed is also related/mainteined by Google company.

@morlovich
Copy link
Contributor

New version with a fix just got released earlier today:
https://groups.google.com/forum/#!topic/ngx-pagespeed-discuss/flIW0eAdVFw

@justbeez
Copy link

Just a note for those looking for this on the mod_pagespeed site, this is available in the 1.11.33.3-beta version announced here:
https://groups.google.com/forum/#!topic/mod-pagespeed-announce/UMCmeItofcY

@nathanjosiah
Copy link

Any update on this issue? I see that a few versions have been released with a fix for this for both Apache and nginx but I am still unable to change the Cache-Control header for HTML responses. The only resource I can find about the Cache-Control header still being set is on the Configuring Downstream Caches page but this won't work for the Google Cloud CDN because the CDN doesn't allow the required configuration such as the PS-Capability header or the purge URL's.

@jmarantz
Copy link
Contributor

This was fixed a long time ago in apache/incubator-pagespeed-mod@c8dc73f and was released in 1.11, which is marked as stable.

However, if you are talking about HTML responses, then this bug is not related to that at all. Note that caching HTML responses with ngx_pagespeed means you cannot enable any user-agent specific optimizations (e.g. webp transcoding). If that's acceptable, configure ModifyCachingHeaders ( https://modpagespeed.com/doc/configuration#ModifyCachingHeaders) and then NPS will just leave your headers alone.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants