get_data_lake_dataset
- SupplyChain.Client.get_data_lake_dataset(**kwargs)
- Enables you to programmatically view an Amazon Web Services Supply Chain data lake dataset. Developers can view data lake dataset information, such as the namespace and schema, for a given instance ID, namespace, and dataset name.
- See also: AWS API Documentation
- Request Syntax

      response = client.get_data_lake_dataset(
          instanceId='string',
          namespace='string',
          name='string'
      )

- Parameters:
- instanceId (string) – [REQUIRED] The Amazon Web Services Supply Chain instance identifier.
- namespace (string) – [REQUIRED] The namespace of the dataset. In addition to custom-defined namespaces, every instance comes with the following pre-defined namespaces:
  - asc – For information on the Amazon Web Services Supply Chain supported datasets, see https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/data-model-asc.html.
  - default – For datasets with custom user-defined schemas.
 
- name (string) – [REQUIRED] The name of the dataset. For the asc namespace, the name must be one of the supported data entities listed at https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/data-model-asc.html.
 
- Return type:
- dict 
- Returns:
- Response Syntax

      {
          'dataset': {
              'instanceId': 'string',
              'namespace': 'string',
              'name': 'string',
              'arn': 'string',
              'schema': {
                  'name': 'string',
                  'fields': [
                      {
                          'name': 'string',
                          'type': 'INT'|'DOUBLE'|'STRING'|'TIMESTAMP'|'LONG',
                          'isRequired': True|False
                      },
                  ],
                  'primaryKeys': [
                      {
                          'name': 'string'
                      },
                  ]
              },
              'description': 'string',
              'partitionSpec': {
                  'fields': [
                      {
                          'name': 'string',
                          'transform': {
                              'type': 'YEAR'|'MONTH'|'DAY'|'HOUR'|'IDENTITY'
                          }
                      },
                  ]
              },
              'createdTime': datetime(2015, 1, 1),
              'lastModifiedTime': datetime(2015, 1, 1)
          }
      }

- Response Structure
- (dict) – The response parameters for GetDataLakeDataset.
- dataset (dict) – The fetched dataset details.
- instanceId (string) – The Amazon Web Services Supply Chain instance identifier.
- namespace (string) – The namespace of the dataset. In addition to custom-defined namespaces, every instance comes with the following pre-defined namespaces:
  - asc – For information on the Amazon Web Services Supply Chain supported datasets, see https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/data-model-asc.html.
  - default – For datasets with custom user-defined schemas.
 
- name (string) – The name of the dataset. For the asc namespace, the name must be one of the supported data entities listed at https://docs.aws.amazon.com/aws-supply-chain/latest/userguide/data-model-asc.html.
- arn (string) – The Amazon Resource Name (ARN) of the dataset.
- schema (dict) – The schema of the dataset.
  - name (string) – The name of the dataset schema.
  - fields (list) – The list of field details of the dataset schema.
    - (dict) – The dataset field details.
      - name (string) – The dataset field name.
      - type (string) – The dataset field type.
      - isRequired (boolean) – Indicates whether the field is required.
 
 
- primaryKeys (list) – The list of primary key fields for the dataset. Defined primary keys help data-ingestion methods ensure data uniqueness: CreateDataIntegrationFlow’s dedupe strategy uses the primary keys to deduplicate records before writing to the dataset, and SendDataIntegrationEvent’s UPSERT and DELETE operations work only with datasets that have primary keys. For more details, refer to the documentation for those data-ingestion methods.
  - Note that defining primary keys does not guarantee the dataset is free of duplicate records; duplicates can still be ingested if CreateDataIntegrationFlow’s dedupe is disabled, or through SendDataIntegrationEvent’s APPEND operation.
  - (dict) – The detail of the primary key field.
    - name (string) – The name of the primary key field.
 
 
 
- description (string) – The description of the dataset.
- partitionSpec (dict) – The partition specification for a dataset.
  - fields (list) – The fields on which to partition a dataset. The partitions are applied hierarchically, based on the order of this list.
    - (dict) – The detail of the partition field.
      - name (string) – The name of the partition field.
      - transform (dict) – The transformation of the partition field. A transformation specifies how to partition on a given field. For example, with a timestamp field you can partition by day, so a record with the value 2025-01-03T00:00:00Z in the partition field falls in the 2025-01-03 partition. Note that a record with no value in an optional partition field falls in the NULL partition.
        - type (string) – The type of partitioning transformation for this field. The available options are:
          - IDENTITY – Partitions data on a given field by its exact values.
          - YEAR – Partitions data on a timestamp field using year granularity.
          - MONTH – Partitions data on a timestamp field using month granularity.
          - DAY – Partitions data on a timestamp field using day granularity.
          - HOUR – Partitions data on a timestamp field using hour granularity.
 
 
 
 
 
- createdTime (datetime) – The creation time of the dataset.
- lastModifiedTime (datetime) – The last modified time of the dataset.
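To make the partition transforms described under partitionSpec concrete, here is a small illustrative sketch (not the service’s internal implementation) of how DAY granularity buckets a timestamp into a calendar-date partition:

```python
from datetime import datetime, timezone

def day_partition(timestamp: str) -> str:
    """Illustrative only: map an ISO-8601 timestamp to its DAY partition value."""
    # Normalize the trailing "Z" so datetime.fromisoformat accepts it,
    # then bucket by UTC calendar date.
    ts = datetime.fromisoformat(timestamp.replace("Z", "+00:00")).astimezone(timezone.utc)
    return ts.strftime("%Y-%m-%d")

# The example from the transform description: a record stamped
# 2025-01-03T00:00:00Z lands in the 2025-01-03 partition.
print(day_partition("2025-01-03T00:00:00Z"))  # 2025-01-03
```

YEAR, MONTH, and HOUR work the same way, just truncating at coarser or finer granularity, while IDENTITY uses the field’s exact value as the partition key.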
 
 
 
 - Exceptions
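To close with a usage sketch: given a response shaped like the Response Syntax section above (all values here are hypothetical), the nested schema section can be walked to find required fields and primary keys before ingesting data:

```python
# A hypothetical response shaped like the Response Syntax above.
response = {
    "dataset": {
        "instanceId": "example-instance-id",
        "namespace": "default",
        "name": "my_dataset",
        "arn": "arn:aws:scn:us-east-1:111122223333:instance/example-instance-id/namespaces/default/datasets/my_dataset",
        "schema": {
            "name": "MyDataset",
            "fields": [
                {"name": "id", "type": "STRING", "isRequired": True},
                {"name": "quantity", "type": "INT", "isRequired": False},
            ],
            "primaryKeys": [{"name": "id"}],
        },
    }
}

schema = response["dataset"]["schema"]

# Fields that every ingested record must populate.
required_fields = [f["name"] for f in schema["fields"] if f["isRequired"]]

# Primary keys drive dedupe in CreateDataIntegrationFlow and are required
# for SendDataIntegrationEvent's UPSERT/DELETE operations; .get() covers
# datasets with no primary keys defined.
primary_keys = [k["name"] for k in schema.get("primaryKeys", [])]

print(required_fields)  # ['id']
print(primary_keys)     # ['id']
```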