Lab complete!
Now that you have completed this lab, make sure to update your Well-Architected review if you have implemented these changes in your workload.
Click here to access the Well-Architected Tool
In the previous step, you provided AWS Cost & Usage Report data in an Amazon S3 bucket. In this step you will make this usage data available in the AWS Glue data catalog for Amazon Athena. Amazon Athena allows you to run SQL (Structured Query Language) queries on the data without loading it into a database and calculate proxy metrics for sustainability.
CREATE DATABASE proxy_metrics_lab
proxy_metrics_lab
from the drop down menu.
S3 URL HERE
on line 160 of the query with the URI of the root of your cost and usage reports in the form of s3://<bucket>/<Report path prefix>/<Report name>/
.s3://<bucket>/cur-data/hourly/proxy-metrics-lab/proxy-metrics-lab/
.s3://<bucket>/cur-data/hourly/proxy-metrics-lab/
.CREATE EXTERNAL TABLE `cur_hourly`(
`identity_line_item_id` string,
`identity_time_interval` string,
`bill_invoice_id` string,
`bill_billing_entity` string,
`bill_bill_type` string,
`bill_payer_account_id` string,
`bill_billing_period_start_date` timestamp,
`bill_billing_period_end_date` timestamp,
`line_item_usage_account_id` string,
`line_item_line_item_type` string,
`line_item_usage_start_date` timestamp,
`line_item_usage_end_date` timestamp,
`line_item_product_code` string,
`line_item_usage_type` string,
`line_item_operation` string,
`line_item_availability_zone` string,
`line_item_resource_id` string,
`line_item_usage_amount` double,
`line_item_normalization_factor` double,
`line_item_normalized_usage_amount` double,
`line_item_currency_code` string,
`line_item_unblended_rate` string,
`line_item_unblended_cost` double,
`line_item_blended_rate` string,
`line_item_blended_cost` double,
`line_item_line_item_description` string,
`line_item_tax_type` string,
`line_item_legal_entity` string,
`product_product_name` string,
`product_activity_type` string,
`product_alarm_type` string,
`product_availability` string,
`product_capacitystatus` string,
`product_category` string,
`product_clock_speed` string,
`product_current_generation` string,
`product_data_type` string,
`product_datatransferout` string,
`product_dedicated_ebs_throughput` string,
`product_description` string,
`product_durability` string,
`product_ecu` string,
`product_edition` string,
`product_endpoint_type` string,
`product_enhanced_networking_supported` string,
`product_event_type` string,
`product_fee_code` string,
`product_fee_description` string,
`product_finding_group` string,
`product_finding_source` string,
`product_finding_storage` string,
`product_from_location` string,
`product_from_location_type` string,
`product_group` string,
`product_group_description` string,
`product_instance_family` string,
`product_instance_type` string,
`product_instance_type_family` string,
`product_intel_avx2_available` string,
`product_intel_avx_available` string,
`product_intel_turbo_available` string,
`product_license_model` string,
`product_location` string,
`product_location_type` string,
`product_logs_destination` string,
`product_logs_source` string,
`product_logs_type` string,
`product_max_iops_burst_performance` string,
`product_max_iopsvolume` string,
`product_max_throughputvolume` string,
`product_max_volume_size` string,
`product_maximum_extended_storage` string,
`product_memory` string,
`product_memory_gib` string,
`product_message_delivery_frequency` string,
`product_message_delivery_order` string,
`product_network_performance` string,
`product_normalization_size_factor` string,
`product_operating_system` string,
`product_operation` string,
`product_physical_processor` string,
`product_pre_installed_sw` string,
`product_processor_architecture` string,
`product_processor_features` string,
`product_product_family` string,
`product_protocol` string,
`product_queue_type` string,
`product_region` string,
`product_request_description` string,
`product_request_type` string,
`product_resource_price_group` string,
`product_routing_target` string,
`product_routing_type` string,
`product_servicecode` string,
`product_servicename` string,
`product_sku` string,
`product_standard_group` string,
`product_standard_storage` string,
`product_standard_storage_retention_included` string,
`product_storage` string,
`product_storage_class` string,
`product_storage_media` string,
`product_storage_type` string,
`product_subscription_type` string,
`product_tenancy` string,
`product_to_location` string,
`product_to_location_type` string,
`product_transfer_type` string,
`product_usagetype` string,
`product_vcpu` string,
`product_version` string,
`product_volume_api_name` string,
`product_volume_type` string,
`pricing_rate_id` string,
`pricing_currency` string,
`pricing_public_on_demand_cost` double,
`pricing_public_on_demand_rate` string,
`pricing_term` string,
`pricing_unit` string,
`reservation_amortized_upfront_cost_for_usage` double,
`reservation_amortized_upfront_fee_for_billing_period` double,
`reservation_effective_cost` double,
`reservation_end_time` string,
`reservation_modification_status` string,
`reservation_normalized_units_per_reservation` string,
`reservation_number_of_reservations` string,
`reservation_recurring_fee_for_usage` double,
`reservation_start_time` string,
`reservation_subscription_id` string,
`reservation_total_reserved_normalized_units` string,
`reservation_total_reserved_units` string,
`reservation_units_per_reservation` string,
`reservation_unused_amortized_upfront_fee_for_billing_period` double,
`reservation_unused_normalized_unit_quantity` double,
`reservation_unused_quantity` double,
`reservation_unused_recurring_fee` double,
`reservation_upfront_value` double,
`savings_plan_total_commitment_to_date` double,
`savings_plan_savings_plan_a_r_n` string,
`savings_plan_savings_plan_rate` double,
`savings_plan_used_commitment` double,
`savings_plan_savings_plan_effective_cost` double,
`savings_plan_amortized_upfront_commitment_for_billing_period` double,
`savings_plan_recurring_commitment_for_billing_period` double,
`resource_tags_aws_cloudformation_logical_id` string,
`resource_tags_aws_cloudformation_stack_id` string,
`resource_tags_aws_cloudformation_stack_name` string,
`resource_tags_user_name` string)
PARTITIONED BY (
`year` int,
`month` int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
'S3 URL HERE' <<< REPLACE ME
TBLPROPERTIES (
'classification'='parquet',
'has_encrypted_data'='false',
'projection.enabled'='true',
'projection.month.range'='1,12',
'projection.month.type'='integer',
'projection.year.range'='2018,2022',
'projection.year.type'='integer')
TBLPROPERTIES
clause), you do not load the partitions to query the data. Amazon Athena is clever enough to calculate them as long as they are in the range of Jan 2018 to December 2022.cur_hourly
table and choose Preview table.
Congratulations! You now can explore your CUR data with Amazon Athena with SQL queries. As soon as new CUR data gets written to Amazon S3, it will be returned in your next query immediately. Take the time to expirement with Amazon Athena’s query editor and explore the columns of your reports.
Now that you have completed this lab, make sure to update your Well-Architected review if you have implemented these changes in your workload.
Click here to access the Well-Architected Tool