
Tuesday, August 14, 2018


Amazon S3 (Simple Storage Service) is a cloud computing web service offered by Amazon Web Services (AWS). Amazon S3 provides object storage through web services interfaces (REST, SOAP, and BitTorrent). Amazon launched S3, its first publicly available web service, in the United States in March 2006 and in Europe in November 2007.

Amazon says that S3 uses the same scalable storage infrastructure that Amazon.com uses to run its own global e-commerce network.

Amazon S3 is reported to store more than 2 trillion objects as of April 2013. This is up from 10 billion objects as of October 2007, 14 billion objects in January 2008, 29 billion objects in October 2008, 52 billion objects in March 2009, 64 billion objects in August 2009, and 102 billion objects in March 2010. S3 uses include web hosting, image hosting, and storage for backup systems. S3 is backed by a service-level agreement (SLA) that guarantees 99.9% monthly uptime, that is, no more than about 43 minutes of downtime per month.
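The 43-minute figure follows directly from the uptime percentage. The sketch below reproduces the arithmetic; the function name and the 30-day month are assumptions made for illustration, not part of any AWS tooling.

```python
from fractions import Fraction


def allowed_downtime_minutes(uptime_percent: str, days_in_month: int = 30) -> Fraction:
    """Return the downtime budget, in minutes, implied by a monthly uptime percentage."""
    minutes_in_month = days_in_month * 24 * 60
    # Exact rational arithmetic avoids floating-point rounding noise.
    downtime_fraction = 1 - Fraction(uptime_percent) / 100
    return minutes_in_month * downtime_fraction


# 99.9% uptime over an assumed 30-day month
print(float(allowed_downtime_minutes("99.9")))  # 43.2
```

A 31-day month gives a slightly larger budget, so the quoted 43 minutes is best read as an approximation.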

Amazon S3 is one of the earliest and key drivers (along with EC2) of AWS, Amazon's most profitable division.





Design

Amazon does not make details of S3's design public, though it manages data with an object-storage architecture. According to Amazon, S3's design aims to provide scalability, high availability, and low latency at commodity costs.

S3 is designed to provide 99.999999999% durability and between 99.95% and 99.99% availability of objects over a given year, depending on which Amazon S3 storage class is used, though there is no service-level agreement for durability.

The basic storage units of Amazon S3 are objects, which are organized into buckets (each owned by an Amazon Web Services account) and identified within each bucket by a unique, user-assigned key. Amazon Machine Images (AMIs), which are used in the Elastic Compute Cloud (EC2), can be exported to S3 as bundles.

Buckets and objects can be created, listed, and retrieved using either a REST-style HTTP interface or a SOAP interface. New Amazon S3 features will not be supported for SOAP, however, and Amazon recommends using either the REST API or the AWS SDKs.


https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html

Additionally, objects can be downloaded using the HTTP GET interface and the BitTorrent protocol.

S3 stores arbitrary objects, for example computer files, up to 5 terabytes in size since 2011, each accompanied by up to 2 kilobytes of metadata.

Requests are authorized using an access control list associated with each bucket and object. Buckets also support versioning, which is disabled by default, and lifecycle management of objects.

Bucket names and keys are chosen so that objects are addressable using HTTP URLs:

  • http://s3.amazonaws.com/bucket/key
  • http://bucket.s3.amazonaws.com/key
  • http://bucket/key (where bucket is a DNS CNAME record pointing to bucket.s3.amazonaws.com)
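The three addressing styles above can be sketched as simple string builders. The function names and the example bucket and key are hypothetical; only the URL shapes come from the list.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"  # endpoint shown in the URL forms above


def path_style_url(bucket: str, key: str) -> str:
    """Bucket name appears as the first path segment."""
    return f"http://{S3_ENDPOINT}/{bucket}/{quote(key)}"


def virtual_hosted_url(bucket: str, key: str) -> str:
    """Bucket name appears as a subdomain of the S3 endpoint."""
    return f"http://{bucket}.{S3_ENDPOINT}/{quote(key)}"


def cname_url(bucket: str, key: str) -> str:
    """Bucket is named after a domain whose CNAME points at bucket.s3.amazonaws.com."""
    return f"http://{bucket}/{quote(key)}"


# Hypothetical bucket and key, for illustration only
print(path_style_url("example-bucket", "photos/cat 1.jpg"))
print(virtual_hosted_url("example-bucket", "photos/cat 1.jpg"))
print(cname_url("media.example.com", "photos/cat 1.jpg"))
```

Keys are percent-encoded with `quote`, which leaves `/` intact so that keys resembling paths stay readable.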

Because objects are accessible by unmodified HTTP clients, S3 can be used to replace significant existing (static) web-hosting infrastructure. The Amazon AWS Authentication mechanism allows the bucket owner to create an authenticated URL with time-bounded validity. That is, someone can construct a URL that can be handed off to a third-party for access for a period such as 30 minutes, or 24 hours.
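A time-bounded URL of this kind can be illustrated with a minimal sketch in the style of S3's legacy query-string authentication (Signature Version 2), in which an expiry timestamp is signed with HMAC-SHA1. This is an illustration under stated assumptions, not a recommended implementation: AWS has since moved to Signature Version 4, real applications would normally let an AWS SDK generate the URL, and the function names, bucket, key, and credentials below are placeholders.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, urlencode


def sign(secret_key: str, string_to_sign: str) -> str:
    """Base64-encoded HMAC-SHA1 of the string to sign."""
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def presign_get(bucket: str, key: str, access_key_id: str,
                secret_key: str, expires_at: int) -> str:
    """Build a GET URL that stops validating after the Unix time expires_at."""
    resource = f"/{bucket}/{quote(key)}"
    # Legacy layout: verb, Content-MD5, Content-Type, expiry, resource
    string_to_sign = f"GET\n\n\n{expires_at}\n{resource}"
    query = urlencode({
        "AWSAccessKeyId": access_key_id,
        "Expires": expires_at,
        "Signature": sign(secret_key, string_to_sign),
    })
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}?{query}"


# Placeholder credentials; a URL valid for 30 minutes from now
url = presign_get("example-bucket", "reports/q1.pdf",
                  "AKIDEXAMPLE", "not-a-real-secret",
                  int(time.time()) + 30 * 60)
print(url)
```

Because the expiry time is part of the signed string, a recipient cannot extend the validity period without invalidating the signature.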

Every item in a bucket can also be served as a BitTorrent feed. The S3 store can act as a seed host for a torrent and any BitTorrent client can retrieve the file. This drastically reduces the bandwidth costs for the download of popular objects. While the use of BitTorrent does reduce bandwidth, AWS does not provide native bandwidth limiting and as such users have no access to automated cost control. This can lead to users on the "free-tier" S3 or small hobby users amassing dramatic bills. AWS representatives have previously stated that such a feature was on the design table from 2006 to 2010 but in 2011 stated the feature is no longer in development.

A bucket can be configured to save HTTP log information to a sibling bucket; this can be used in later data mining operations.





Amazon S3 Storage Classes

Amazon S3 offers several storage classes designed for different use cases depending on durability, availability, and performance requirements: Amazon S3 Standard, Reduced Redundancy Storage (RRS), Amazon S3 Standard Infrequent Access (IA), Amazon S3 One Zone Infrequent Access, and Amazon Glacier, which is designed for data archiving.

  • Amazon S3 Standard is the default class.
  • Amazon S3 Standard Infrequent Access (IA) is used for less frequently accessed data. Typical use cases are backups and disaster recovery solutions. Storage costs are lower than for Amazon S3 Standard, but additional fees apply per gigabyte of data retrieved.
  • Amazon S3 Reduced Redundancy Storage (RRS) is designed for noncritical, reproducible data stored at lower levels of redundancy. It reduces costs by storing data in a less fault-tolerant manner: it is designed to sustain the loss of one facility, whereas Amazon S3 Standard is designed to sustain the loss of two. Typical use cases involve data that could be recreated in the case of data loss. Durability is claimed to be 99.99%, compared with 99.999999999% for the Standard class.
  • Amazon Glacier is designed for long-term storage of data that is infrequently accessed and for which a retrieval latency of minutes or hours is acceptable. Use cases for this may be as a status service, where other servers may not need to be checked so frequently.
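The difference between the two quoted durability figures is easier to see as an expected number of objects lost per year. The sketch below assumes the percentages can be read as a per-object annual survival probability; the function name and the ten-million-object example are illustrative.

```python
from fractions import Fraction


def expected_annual_loss(object_count: int, durability_percent: str) -> Fraction:
    """Expected number of objects lost in one year at the given durability."""
    # Assumption: durability is an independent per-object annual survival probability.
    loss_probability = 1 - Fraction(durability_percent) / 100
    return object_count * loss_probability


objects = 10_000_000  # hypothetical bucket size
print(float(expected_annual_loss(objects, "99.999999999")))  # 0.0001
print(float(expected_annual_loss(objects, "99.99")))         # 1000.0
```

Under these assumptions, ten million objects at the Standard durability figure lose on average one object every 10,000 years, while the same data at the RRS figure loses about a thousand objects per year.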



Pricing

Amazon S3 pricing varies by storage class. Within a class, prices depend on storage usage, number of requests, and data transfer. At its inception, Amazon charged end users $0.15 per gigabyte-month, with additional charges for bandwidth used in sending and receiving data, and a per-request (get or put) charge. On November 1, 2008, pricing moved to tiers, with end users storing more than 50 terabytes receiving discounted pricing. As of July 2018, the price for the first 50 TB ranges from $0.023 to $0.0405 per gigabyte per month, depending on the choice of location for storage.
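Tiered pricing of this kind can be computed by billing each slice of stored data at its tier's rate. In the sketch below, only the $0.023 rate for the first 50 TB comes from the text; the rate above 50 TB, the use of 1,024 GB per TB, and all names are assumptions for illustration, and request and transfer charges are ignored.

```python
from decimal import Decimal

GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in GB, or None for "everything above"; price per GB-month)
EXAMPLE_TIERS = [
    (50 * GB_PER_TB, Decimal("0.023")),  # first 50 TB, lowest rate quoted in the text
    (None, Decimal("0.022")),            # hypothetical discounted rate above 50 TB
]


def monthly_storage_cost(stored_gb: int, tiers=EXAMPLE_TIERS) -> Decimal:
    """Sum the storage charge for one month, filling tiers in order."""
    remaining = stored_gb
    total = Decimal("0")
    for tier_size, price in tiers:
        if remaining <= 0:
            break
        billed = remaining if tier_size is None else min(remaining, tier_size)
        total += billed * price
        remaining -= billed
    return total


print(monthly_storage_cost(1000))            # storage entirely inside the first tier
print(monthly_storage_cost(60 * GB_PER_TB))  # 50 TB at the first rate, 10 TB at the second
```

`Decimal` is used rather than floating point so that currency amounts are not affected by binary rounding.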




Hosting entire websites

Amazon S3 provides the option to host static websites, with support for index documents and error documents. This support was added in response to user requests dating back at least to 2006. For example, suppose that Amazon S3 was configured with CNAME records to host "http://subdomain.example.com/". In the past, a visitor to this URL would find only an XML-formatted list of objects instead of a general landing page (e.g., index.html) to accommodate casual visitors. Websites hosted on S3 may now designate a default page to display, and another page to display when a request fails, such as with a 404 error.




Notable users

Photo hosting service SmugMug has used S3 since April 2006. They experienced a number of initial outages and slowdowns, but after one year they described it as being "considerably more reliable than our own internal storage" and claimed to have saved almost $1 million in storage costs.

There are various Filesystem in Userspace (FUSE)-based file systems for Unix-like operating systems (Linux, etc.) that can be used to mount an S3 bucket as a file system. Because the semantics of S3 are not those of a POSIX file system, a bucket mounted this way may not behave entirely as expected.

  • Apache Hadoop file systems can be hosted on S3, as its requirements of a file system are partially met by S3. As a result, Hadoop can be used to run MapReduce algorithms on EC2 servers, reading data and writing results back to S3.
  • Netflix uses Amazon Web Services for their storage and compute operations with S3 being their system of record. Netflix implemented a tool, S3mper, to address the limitations of eventual consistency that Amazon S3 provides. S3mper stores the filesystem metadata: filenames, directory structure and permissions in Amazon DynamoDB.
  • reddit is hosted on S3.
  • Dropbox, Bitcasa, and Tahoe-LAFS-on-S3, among others, use S3 for online backup and synchronization services. In 2016, Dropbox moved out from using Amazon S3 services and developed its own cloud server.
  • Mojang hosts Minecraft game updates and player skins on S3.
  • Tumblr, Formspring, and Pinterest host images on S3.
  • Swiftype's CEO has mentioned that the company uses S3.
  • S3 was used in the past by some enterprises as a long term archiving solution, until Amazon Glacier was released in August 2012.
  • The API has become a popular method for object storage. As a result, more and more applications have been built to natively support the S3 API. This includes applications that write data to AWS S3, as well as to S3-compatible object stores.



Amazon S3 logs

Amazon S3 allows users to enable or disable logging. If enabled, the logs are stored in Amazon S3 buckets, where they can then be analyzed. These logs contain useful information such as:

  • Date / time of access to requested content
  • Protocol used (HTTP, FTP, etc.)
  • HTTP Status Code
  • Turnaround time
  • HTTP Request

These logs can be analyzed and managed by using third-party tools such as S3Stat, Cloudlytics, Qloudstat, AWStats or Splunk.
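As a minimal sketch of such analysis, the following splits one access-log line into named fields. The field order is assumed to follow the space-delimited layout of S3 server access logs (with the timestamp in brackets and the request, referrer, and user agent in quotes); the sample line, identifiers, and field names are made up for illustration, and real logs may carry additional trailing fields that this sketch ignores.

```python
import re

# Assumed field order of an S3 server access log record
FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time_ms", "turnaround_time_ms",
    "referrer", "user_agent", "version_id",
]

# A token is a [bracketed] span, a "quoted" span, or a run of non-space characters.
TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def parse_log_line(line: str) -> dict:
    """Map the tokens of one log line onto FIELDS; surplus tokens are dropped."""
    tokens = [token.strip('[]"') for token in TOKEN.findall(line)]
    return dict(zip(FIELDS, tokens))


# Hypothetical log line, not taken from a real bucket
sample = (
    'OWNERID example-bucket [14/Aug/2018:10:15:30 +0000] 192.0.2.10 '
    'REQUESTERID 3E57427F3EXAMPLE REST.GET.OBJECT photos/cat.jpg '
    '"GET /example-bucket/photos/cat.jpg HTTP/1.1" 200 - 2048 2048 35 12 '
    '"-" "curl/7.61.0" -'
)
record = parse_log_line(sample)
print(record["time"], record["request_uri"], record["http_status"],
      record["turnaround_time_ms"])
```

The printed fields correspond to the access time, HTTP request (including protocol), status code, and turnaround time listed above.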




S3 API and competing services

The S3 API exposes operations on the different components of Amazon S3: the service itself, buckets, and objects.

The broad adoption of Amazon S3 and related tooling has given rise to competing services based on the S3 API. These services use the standard programming interface; however, they are differentiated by their underlying technologies and supporting business models. A cloud storage standard (like electrical and networking standards) enables competing service providers to design their services and clients using different parts in different ways yet still communicate and provide the following benefits:

  1. Increase competition by providing a set of rules and a level playing field, encouraging market entry by smaller companies which might otherwise be precluded.
  2. Encourage innovation by cloud storage and tool vendors and by developers, because they can focus on improving their own products and services instead of focusing on compatibility.
  3. Allow economies of scale in implementation (i.e., if a service provider encounters an outage or as clients outgrow their tools and need faster operating systems or tools, they can easily swap out solutions).
  4. Provide timely solutions for delivering functionality in response to demands of the marketplace (i.e., as business growth in new locations increases demand, clients can easily change or add service providers simply by subscribing to the new service).

Examples of competing S3-compatible storage implementations:

  • Microsoft Azure's BLOB storage
  • Ceph with RADOS gateway
  • Cloudian HyperStore
  • Apache CloudStack
  • Connectria's Cloud Storage
  • DDN Web Object Scaler (WOS) for on-premises Cloud storage
  • DELL EMC Elastic Cloud Storage (ECS)
  • DigitalOcean Spaces
  • Eucalyptus
  • Google Cloud Storage
  • IBM Bluemix object storage
  • IBM Cloud Object Storage (formerly Cleversafe) for on-premise Object Storage as well as the IBM Public Cloud
  • Minio Object Storage
  • NooBaa Hybrid Storage
  • Nimbula (acquired by Oracle)
  • OpenIO
  • Openstack Swift
  • Pure Storage's FlashBlade
  • NetApp StorageGRID for on-premise Clouds
  • Rackspace's Cloud Files
  • Riak CS, which implements a subset of the S3 API including REST and ACLs on objects and buckets.
  • Scality RING



Amazon S3 tools

Amazon S3 provides an API for third-party developers. It describes various API operations, related request and response structures, and error codes. The original AWS Console provides tools for managing and uploading files, but it is not capable of managing large buckets or editing files online. Third party websites like S3edit.com or Cloudberry Explorer software can help edit files on S3.




History

Amazon introduced S3 in 2006.




See also

  • Amazon Elastic Block Store (EBS)





External links

  • Official website

Source of article : Wikipedia