Using RadosGW for website hosting?

Hosting a web site in radosgw

Posted on Tue 26 January 2016 in hints-and-kinks

If you're familiar with web site hosting on Amazon S3, which is a simple and cheap way to host a static web site, you might be wondering whether or not you can do the same in Ceph radosgw.

The short answer is you can't. Bucket Website is listed as Not Supported in the radosgw S3 API support matrix, and radosgw doesn't have index document support either.

But the longer answer is that you can, provided you use radosgw in combination with a front-end load balancer, which, as it happens, can add a few more bells and whistles as well. You could probably do the same thing with nginx, Varnish, or Apache in a mod_proxy_balancer setup, but in this example configuration, we'll use HAProxy.

Getting started: the radosgw basics

Let's take a look at a simple radosgw configuration with virtual host support, such that you can access your buckets as either http://ceph.example.com/bucketname or http://bucketname.ceph.example.com:

    [client.rgw.radosgw01]
    rgw_frontends = civetweb port=7480
    rgw_dns_name = ceph.example.com
    rgw_resolve_cname = True

Suppose we use s3cmd to create a bucket and upload an HTML file to it, setting a public ACL:

    s3cmd mb s3://testwebsite
    s3cmd put --acl-public index.html s3://testwebsite/

Then if you exposed your radosgw to the web, any client (without authentication) would be able to retrieve http://testwebsite.ceph.example.com:7480/index.html with a web browser, or any other HTTP client application (such as curl or wget):

    curl -I http://testwebsite.ceph.example.com:7480/index.html

Which would then return something like:

    HTTP/1.1 200 OK
    Content-Length: 18050
    Accept-Ranges: bytes
    Last-Modified: Mon, 25 Jan 2016 21:28:47 GMT
    ETag: "b03130a4a1fc24df0f9f336f2b6d1d90"
    x-amz-request-id: tx000000000000000005a88-0056a7b7eb-312df-default
    Content-type: text/html
    Date: Tue, 26 Jan 2016 18:16:11 GMT

Introducing HAProxy

Now let's start by putting HAProxy in between. Nothing special there: radosgw listens on its conventional port 7480; we bind HAProxy to port 80 and simply have it hand traffic through to radosgw.

    global
      log /dev/log local0
      pidfile /var/run/haproxy.pid
      maxconn 4000
      user haproxy
      group haproxy
      daemon

      # turn on stats unix socket
      stats socket /var/lib/haproxy/stats level admin

      # Default SSL material locations
      ca-base /etc/ssl/certs
      crt-base /etc/haproxy/ssl

      # Default ciphers to use on SSL-enabled listening sockets.
      # For more information, see ciphers(1SSL).
      ssl-default-bind-ciphers HIGH
      tune.ssl.default-dh-param 2048

    defaults
      log global
      mode http
      option httplog
      option dontlognull
      retries 3
      timeout queue 1000
      timeout connect 1000
      timeout client 30000
      timeout server 30000
      option forwardfor

    frontend ceph_front
      bind 0.0.0.0:80
      default_backend ceph_back

    backend ceph_back
      balance source
      server radosgw01 127.0.0.1:7480 check

Index documents

So, the first thing we'll need to add is support for index documents. We'd like to make sure that when we retrieve http://testwebsite.ceph.example.com/, what's actually fetched from the backend is /index.html. We can do that by adding an HAProxy ACL that matches the trailing slash in the path, and an http-request set-path directive that appends the index document name:

    frontend ceph_front
      bind 0.0.0.0:80
      acl path_ends_in_slash path_end -i /
      # Append index document (index.html) to any path
      # ending in "/".
      http-request set-path %[path]index.html if path_ends_in_slash
      default_backend ceph_back

Now, that's fine in terms of getting the index document correctly:

    curl -I http://testwebsite.ceph.example.com/index.html
    HTTP/1.1 200 OK
    Content-Length: 18050
    Accept-Ranges: bytes
    Last-Modified: Mon, 25 Jan 2016 21:28:47 GMT
    ETag: "b03130a4a1fc24df0f9f336f2b6d1d90"
    x-amz-request-id: tx000000000000000005a94-0056a7b9e3-312df-default
    Content-type: text/html
    Date: Tue, 26 Jan 2016 18:24:35 GMT

However, it of course breaks uploads and even bucket listings, or in other words, anything that uses the S3 API. Now you could test for some S3-specific headers in the request, but really, you should just check whether the request carries an Authorization header, and only apply the index document logic if it doesn't, like so:

    frontend ceph_front
      bind 0.0.0.0:80
      acl path_ends_in_slash path_end -i /
      acl auth_header hdr(Authorization) -m found
      # Append index document (index.html) to any path
      # ending in "/", unless the request has an auth header
      http-request set-path %[path]index.html if path_ends_in_slash !auth_header
      default_backend ceph_back

Great. Now we can upload using full paths without mangling, and on any unauthenticated request, we substitute /index.html for any trailing /. In case you're wondering: yes, this works for any path, not just the root path.

Directory paths

However, you may also want something else, which is the ability to correctly handle a request like http://testwebsite.ceph.example.com/my/sub/directory. Here, of course, you want the path /my/sub/directory translated into /my/sub/directory/index.html, which means we want to append a slash and an index document name to the request path. So let's do that:

    frontend ceph_front
      bind 0.0.0.0:80
      acl path_has_dot path_sub -i .
      acl path_ends_in_slash path_end -i /
      acl auth_header hdr(Authorization) -m found
      http-request set-path %[path]index.html if path_ends_in_slash !auth_header
      # Append trailing slash if necessary.
      http-request set-path %[path]/index.html if !path_has_dot !path_ends_in_slash !auth_header
      default_backend ceph_back

Note that what we're doing here is somewhat crude. We're assuming that any actual file that we want to retrieve looks like name.ext, meaning it has a dot (period, full stop) character in it. The path_sub -i . expression in the path_has_dot ACL simply matches any path with a . in it, and we're assuming that if a path has a dot then it points to a file, and if it doesn't then it points to a directory.

You could be a little more clever here and use path_reg instead of path_sub for a full regular expression match. But regex lookups are slower than simple substring matches, so if the substring match works for you, go for it.

So now, we can do this:

    s3cmd put --acl-public index.html s3://testwebsite/my/sub/directory/

And then:

    # Note omitted trailing slash
    curl -I http://testwebsite.ceph.example.com/my/sub/directory
    HTTP/1.1 200 OK
    Content-Length: 24235
    Accept-Ranges: bytes
    Last-Modified: Mon, 25 Jan 2016 23:57:04 GMT
    ETag: "fecd005b33c0f6bfdee61b787cf54cb0"
    x-amz-request-id: tx00000000000000000bc83-0056a7bd25-312cd-default
    Content-type: text/html
    Date: Tue, 26 Jan 2016 18:38:29 GMT

HTTPS support

So, what else might you want to do? One obvious thing that you can use HAProxy for is SSL termination. The radosgw embedded civetweb webserver can do that for you, but that feature is currently mildly broken in a rather curious way. So in order to allow HTTPS access to all your content via HAProxy instead, you would add:

    frontend ceph_front_ssl
      bind 0.0.0.0:443 ssl crt ceph.pem no-sslv3 no-tls-tickets
      reqadd X-Forwarded-Proto:\ https
      acl path_has_dot path_sub -i .
      acl path_ends_in_slash path_end -i /
      acl auth_header hdr(Authorization) -m found
      http-request set-path %[path]index.html if path_ends_in_slash !auth_header
      http-request set-path %[path]/index.html if !path_has_dot !path_ends_in_slash !auth_header
      default_backend ceph_back

But maybe you'd like to force, not merely allow, HTTPS access. The redirect directive to the rescue:

    frontend ceph_front
      bind 0.0.0.0:80
      reqadd X-Forwarded-Proto:\ http
      redirect scheme https code 301 if !{ ssl_fc }

    frontend ceph_front_ssl
      bind 0.0.0.0:443 ssl crt ceph.pem no-sslv3 no-tls-tickets
      reqadd X-Forwarded-Proto:\ https
      acl path_has_dot path_sub -i .
      acl path_ends_in_slash path_end -i /
      acl auth_header hdr(Authorization) -m found
      http-request set-path %[path]index.html if path_ends_in_slash !auth_header
      http-request set-path %[path]/index.html if !path_has_dot !path_ends_in_slash !auth_header
      default_backend ceph_back

And here we go:

    # Note HTTP
    curl -IL http://testwebsite.ceph.example.com/my/sub/directory
    HTTP/1.1 301 Moved Permanently
    Content-length: 0
    Location: https://testwebsite.ceph.example.com/my/sub/directory
    Connection: close

    HTTP/1.1 200 OK
    Content-Length: 24235
    Accept-Ranges: bytes
    Last-Modified: Mon, 25 Jan 2016 23:57:04 GMT
    ETag: "fecd005b33c0f6bfdee61b787cf54cb0"
    x-amz-request-id: tx00000000000000000bdeb-0056a7bf9b-312cd-default
    Content-type: text/html
    Date: Tue, 26 Jan 2016 18:48:59 GMT

Compression

And finally, maybe you'd like to speed up access to the stuff on your site. Why not add on-the-fly gzip compression? It's supported by every browser worth its salt, and will make your users happier.

You'll want to restrict compression to specific MIME types though. In the configuration below, we enable compression for plain text, HTML, XML, CSS, JavaScript, and SVG images.
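
As a rough illustration of why the MIME type restriction matters, here is a minimal Python sketch (not part of the HAProxy setup) that gzips two made-up payloads: repetitive HTML-like text, and random bytes standing in for content that is already compressed, such as JPEG images. The payloads and the gzip_ratio helper are assumptions introduced purely for illustration.

```python
import gzip
import os


def gzip_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical sample payloads, not taken from the test site in this article.
html = (
    b"<html><body>"
    + b"<p>Hello from radosgw behind HAProxy.</p>\n" * 500
    + b"</body></html>"
)
# Random bytes are incompressible, much like already-compressed media.
random_bytes = os.urandom(len(html))

print(f"HTML-like text: {gzip_ratio(html):.2f}")
print(f"random bytes:   {gzip_ratio(random_bytes):.2f}")
```

Text-like content shrinks considerably, while data that is already compressed barely changes, so spending CPU cycles on it is wasted effort. The HAProxy configuration itself follows.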

    frontend ceph_front
      bind 0.0.0.0:80
      reqadd X-Forwarded-Proto:\ http
      redirect scheme https code 301 if !{ ssl_fc }

    frontend ceph_front_ssl
      bind 0.0.0.0:443 ssl crt ceph.pem no-sslv3 no-tls-tickets
      reqadd X-Forwarded-Proto:\ https
      acl path_has_dot path_sub -i .
      acl path_ends_in_slash path_end -i /
      acl auth_header hdr(Authorization) -m found
      http-request set-path %[path]index.html if path_ends_in_slash !auth_header
      http-request set-path %[path]/index.html if !path_has_dot !path_ends_in_slash !auth_header
      compression algo gzip
      compression type text/html text/xml text/plain text/css application/javascript image/svg+xml
      default_backend ceph_back

Let's see how that helps us. Do a request without gzip encoding support, and observe that its total download size matches the document's Content-Length:

    curl https://testwebsite.ceph.example.com/my/sub/directory > /dev/null
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 24235  100 24235    0     0  94565      0 --:--:-- --:--:-- --:--:-- 94299

Now, add an Accept-Encoding header:

    curl -H 'Accept-Encoding: gzip' https://testwebsite.ceph.example.com/my/sub/directory > /dev/null
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  5237    0  5237    0     0  19243      0 --:--:-- --:--:-- --:--:-- 19324

There. Actual download size goes from 24 KB down to just 5 KB.

Where to go from here

There are a few additional features you could add here. You could enable CORS or HSTS, for example, and of course you could add more backends. But if you read this far, you surely get the idea. And you're welcome to examine the headers you can pull from this page you're reading, wink wink. :)

This article originally appeared on the hastexo.com website (now defunct).
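
As a recap, the two path-rewriting rules used throughout this article can be modelled in a few lines of Python, which is handy for checking how a given path would be treated before touching the proxy. This is only a sketch of the logic, not how HAProxy implements it; the rewrite_path function and its parameters are hypothetical names introduced here.

```python
def rewrite_path(path: str, has_auth_header: bool, index: str = "index.html") -> str:
    """Model the two http-request set-path rules for a single request path.

    `path` is the URL path without the query string, and `has_auth_header`
    stands in for the auth_header ACL (an Authorization header is present).
    """
    # Requests carrying an Authorization header are S3 API calls: leave them alone.
    if has_auth_header:
        return path
    # Rule 1: a trailing slash gets the index document appended.
    if path.endswith("/"):
        return path + index
    # Rule 2: no dot anywhere in the path is taken to mean "directory".
    if "." not in path:
        return path + "/" + index
    # Anything else is assumed to be a file and passes through unchanged.
    return path


# A few sample paths; the last one shows the crudeness of the dot heuristic.
for sample in ["/", "/my/sub/directory", "/css/site.css", "/releases/v1.2/notes"]:
    print(sample, "->", rewrite_path(sample, has_auth_header=False))
```

Note that a path such as /releases/v1.2/notes contains a dot and is therefore left untouched, which is exactly the limitation of the substring match described under "Directory paths".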