# AWSome Day Notes: Part 1: The Basics

Following are some notes from Amazon's AWSome Day (Tuesday, February 27, 2018).

## EC2 Costs and Scheduling

Cost of a node:
* Important to understand Amazon's price model: users pay for *access*, not for *hardware*
* The cost of an AWS node is the cost of *on the spot access*

Scheduling:
* If you can anticipate your usage, you can reserve instances in advance and get a discount
* Discount of 50% for a one-year reservation (if you keep it busy for 6 months, you've made your money back)
* Spot instances are also available - jobs need to be robust to sudden starts/stops (good for embarrassingly parallel jobs)
* It is cheaper to anticipate your usage and plan ahead

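The break-even arithmetic behind the reservation discount can be sketched in a few lines. This is a minimal sketch, assuming the reservation is billed for the whole term at the discounted rate whether or not the node is busy; `break_even_months` and `cheaper_option` are hypothetical helpers, not part of any AWS tool.

```python
def break_even_months(discount: float, term_months: int = 12) -> float:
    """Months of on-demand use that cost the same as a full reserved term.

    Assumption: the reservation costs (1 - discount) times the on-demand
    rate for every month of the term, used or not.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return term_months * (1.0 - discount)


def cheaper_option(busy_months: float, discount: float, term_months: int = 12) -> str:
    """Return 'reserved' or 'on-demand' for the expected months of use."""
    if busy_months > break_even_months(discount, term_months):
        return "reserved"
    return "on-demand"


# 50% discount on a one-year term, as quoted in the notes
print(break_even_months(0.5))   # 6.0
print(cheaper_option(9, 0.5))   # reserved
print(cheaper_option(3, 0.5))   # on-demand
```

With the 50% figure from the notes, anything busier than 6 months out of 12 favors the reservation.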
## EC2 Transfer Costs

EC2 Instances:
* See the [EC2 Instance Pricing - Data Transfer](https://aws.amazon.com/ec2/pricing/on-demand/) section
* Network costs for AWS nodes are an important consideration for high-traffic nodes (>10 TB)

EC2-Internet:
* Traffic going from the internet *into* a node is always free
* Traffic going from the node *out* to the internet incurs costs after 10 TB
* Outbound traffic costs ~$90/TB

AWS Regions:
* Traffic *within* a region does not incur costs (well... it's complicated)
* Traffic *between* regions will incur costs

EC2-S3:
* Transfer *into* an EC2 node from an S3 bucket in the same AWS region does not incur costs
* Transfer *out of* an EC2 node into an S3 bucket in the same AWS region does not incur costs
* (If they did charge you, they would be double-dipping...)

Note: the list of prices is like a legal document, so use the [AWS Monthly Calculator](https://calculator.s3.amazonaws.com/index.html) to estimate monthly costs with more detail.

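For a quick back-of-the-envelope number before opening the calculator, the rules of thumb above reduce to a one-line formula. This is a rough sketch, assuming the ~$90/TB outbound figure from these notes (a 2018 rule of thumb, not a current price) and ignoring pricing tiers and any free allowance; `ec2_transfer_cost` is a hypothetical helper.

```python
OUTBOUND_USD_PER_TB = 90.0  # rule of thumb from the notes, not a current price


def ec2_transfer_cost(inbound_tb: float, outbound_tb: float,
                      same_region_s3_tb: float = 0.0,
                      rate: float = OUTBOUND_USD_PER_TB) -> float:
    """Estimate the transfer bill for one EC2 node in USD.

    Inbound traffic and same-region S3 traffic are treated as free,
    so only outbound internet traffic contributes to the total.
    """
    for value in (inbound_tb, outbound_tb, same_region_s3_tb):
        if value < 0:
            raise ValueError("traffic volumes must be non-negative")
    return outbound_tb * rate


# 5 TB in, 2 TB out to the internet, 10 TB exchanged with same-region S3
print(ec2_transfer_cost(5, 2, 10))  # 180.0
```

Only the outbound volume moves the estimate, which is the main point of the section.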
## S3 Transfer Costs

* See [S3 Pricing - Data Transfer](https://aws.amazon.com/s3/pricing/)
* The price model for storage is similar to the price model for AWS nodes: you pay for *access*, not for *hardware*
* To give a sense of why, think about the logistics of a large "disk farm": all the intensive operations are done by the head nodes, and the disks are just passive
* A busier disk farm needs sophisticated hardware for parallel read/write, high-bandwidth network lines, and fast encryption

S3 storage pricing:
* Rule of thumb: ~$20/TB/mo to store the data

S3-Internet:
* Transfer *into* an S3 bucket from the internet is always free (getting stuff into the bucket is the easy part - that's how they get ya)
* Transfer *out* of an S3 bucket to the internet costs ~$90/TB

S3-EC2:
* Transfer *out* of an S3 bucket to most other Amazon regions costs ~$20/TB
* Transfer *out* of an S3 bucket into an EC2 node in the same AWS region does not incur costs
* Transfer *into* an S3 bucket from an EC2 node in the same AWS region does not incur costs

As mentioned above, this means you won't be double-charged for transferring data from an S3 bucket to an EC2 node, then from the EC2 node out to the internet.

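The "no double charge" point can be checked by adding up the per-hop rates along a transfer path. This is a minimal sketch, assuming the rule-of-thumb rates quoted in these notes and that `ec2` means a node in the same region as the bucket; the `HOP_USD_PER_TB` table and `path_cost` are hypothetical names for illustration.

```python
# Rule-of-thumb rates in USD per TB, taken from the notes (2018 figures).
HOP_USD_PER_TB = {
    ("internet", "s3"): 0.0,       # uploads are free
    ("s3", "ec2"): 0.0,            # same region
    ("ec2", "s3"): 0.0,            # same region
    ("s3", "internet"): 90.0,
    ("ec2", "internet"): 90.0,
    ("s3", "other-region"): 20.0,
}


def path_cost(path, size_tb):
    """Sum the transfer cost of moving size_tb along consecutive hops."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    return sum(HOP_USD_PER_TB[(src, dst)] * size_tb
               for src, dst in zip(path, path[1:]))


direct = path_cost(["s3", "internet"], 3)
via_ec2 = path_cost(["s3", "ec2", "internet"], 3)
print(direct, via_ec2)  # 270.0 270.0
```

Routing the data through a same-region EC2 node costs the same as serving it straight from the bucket, because the S3-to-EC2 hop is free.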
## S3 Storage Hierarchies

Continuing with the theme of planning ahead...

Storage hierarchies:
* The biggest cost of storage is not disk space, it's transfer
* You are paying for speed, paying for timeliness, paying for *on the spot access* to your data
* Your data will be cheaper if you're willing to wait a few minutes or deal with a slow connection

Storage classes:
* Standard (~$20/TB/mo)
* Infrequent access (~$13/TB/mo) - for less frequent access, but at the same transfer speed
* Glacier (~$4/TB/mo) - delay of up to 12 hours (smaller files = faster), deleting data *newer* than 3 months incurs costs

[Glacier Pricing](https://aws.amazon.com/glacier/pricing/)

Lifecycle rules:
* Can create rules to move old data from S3 buckets into Glacier

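A lifecycle rule is just a small configuration document attached to the bucket. The sketch below builds one in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; the `logs/` prefix, the 90-day threshold, the bucket name, and the `glacier_lifecycle` helper are all assumptions made up for illustration.

```python
import json


def glacier_lifecycle(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that moves objects under
    `prefix` to Glacier once they are `days` days old."""
    if days < 0:
        raise ValueError("days must be non-negative")
    label = prefix.strip("/") or "all"
    return {
        "Rules": [
            {
                "ID": f"archive-{label}-after-{days}d",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


config = glacier_lifecycle("logs/", 90)
print(json.dumps(config, indent=2))

# To apply it (needs boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-example-bucket", LifecycleConfiguration=config)
```

Keeping the rule as plain data makes it easy to review before it touches a real bucket.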
## EFS vs EBS vs S3

When do you use EFS, EBS, or S3?

Elastic Block Store (EBS):
* **This is probably what you want**
* EBS is block storage for one EC2 node - designed for general purpose applications
* Cost: ~$120/TB/mo

Elastic File System (EFS):
* EFS is shared file storage for multiple EC2 nodes - designed for fast read-write operations, many incremental changes to files
* "Elastic" part of EFS - the file system grows dynamically as your data grows (PB+ scale)
* Hard drive on steroids - like plugging in a hard drive over a network, but big/fast/smart enough to be accessible to thousands+ of machines
* Expensive: ~$300/TB/mo

S3:
* S3 is object storage - it stores blobs of raw data, creates snapshots in time
* If you change a single character of a large file, the bucket has to create a new snapshot of the whole file
* Booting from S3 as a hard disk would take you about a thousand years... don't do that
* Cheapest: ~$20/TB/mo

Cool but $$$:
* You may see "appliances" mentioned in Amazon documentation - Amazon will ship you a physical data transfer appliance that encrypts and copies data on site ([Snowball](https://docs.aws.amazon.com/snowball/latest/ug/images/Snowball-closed-600w.png))
* Can also purchase special network connections that bypass the public internet - like your ISP putting alligator clips between your network lines and Amazon's network lines

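To put the three storage prices side by side, the per-TB figures above can be dropped into a small table. This is a rough sketch, assuming the rule-of-thumb monthly rates from these notes (2018 figures, storage only, no transfer or request charges); `monthly_cost` and `compare` are hypothetical helpers.

```python
# Rule-of-thumb storage rates in USD per TB per month, from the notes.
USD_PER_TB_MONTH = {"S3": 20.0, "EBS": 120.0, "EFS": 300.0}


def monthly_cost(service: str, size_tb: float) -> float:
    """Storage-only monthly cost in USD for size_tb on one service."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    return USD_PER_TB_MONTH[service] * size_tb


def compare(size_tb: float) -> dict:
    """Monthly cost per service, ordered cheapest first."""
    costs = {name: monthly_cost(name, size_tb) for name in USD_PER_TB_MONTH}
    return dict(sorted(costs.items(), key=lambda item: item[1]))


for name, usd in compare(5).items():
    print(f"{name}: ${usd:,.0f}/mo")
```

For 5 TB this prints S3 at $100/mo, EBS at $600/mo, and EFS at $1,500/mo, which is why the notes steer general-purpose workloads toward EBS and bulk data toward S3.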