awslabs / Aws Data Api

Licence: apache-2.0

Programming Languages

python

AWS Data APIs - Beta

AWS Data APIs let you replace traditional database back ends for your applications with simple HTTP APIs. They offer the speed, scalability, reliability, and security of a sophisticated NoSQL platform, with zero coding and no servers to manage. In seconds, you can create a new Data API Namespace that includes your data model, natural language search, and sophisticated data lineage tracking, all presented to your application as an HTTPS REST API.

Data APIs give developers a powerful new way to build applications, replacing complex database clusters, data modelling, and search integration with a simple-to-use, document-oriented API. They also unify your application data models with your data lake, allowing simple exports and direct queries against application data from the Glue Data Catalog and Amazon Athena.

Data API Features include:

  • Database Storage
    • Data APIs provide both structured and document-type storage through DynamoDB
    • For a given ‘table’ (called a Namespace in Data APIs), you store Data Items as well as a separate set of Metadata associated with those Data Items
    • Data & Metadata can have schemas applied, allowing for simple ‘flat’ RDBMS type tables, or sophisticated document models. You can choose whether you allow your application developers to extend these schemas.
    • Master Data Management features around ItemMaster reconciliation
    • Optimistic Concurrency Control is supported, and can be configured as required by an Administrator
    • Soft deletes, which can later be restored, are supported, as are ‘tombstone’ deletes in support of ‘right to be forgotten’ requirements
  • Flexible Queries
    • Data APIs provide native indexing of both Data and Metadata attributes
    • Reference graph searching
  • Elasticsearch Integration
    • You can supply an Elasticsearch cluster, which will be used to automatically index all Data & Metadata and to augment the query API
  • Data Lake Integration
    • API Namespaces are automatically registered with AWS Glue, and you can write Athena queries against your Data
    • API Namespaces can be exported to Amazon S3 on demand
  • Streaming
    • Every Data and Metadata store is preconfigured with DynamoDB Streams, and provides API integration to create streaming clients
  • Data Graph
    • You can supply a Gremlin endpoint, which enables arbitrary ‘References’ graphs which support data lineage tracking and any other type of data connections customers wish to make
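
The Optimistic Concurrency Control, soft delete, and tombstone delete features listed above can be illustrated with a small in-memory sketch. This is not the Data API implementation: the class `InMemoryNamespace`, its method names, and the integer `version` field are all hypothetical and exist only to show the behaviour a caller might expect.

```python
import copy


class ConflictError(Exception):
    """Raised when the caller's expected version is stale."""


class InMemoryNamespace:
    """Toy stand-in for one Namespace (hypothetical, not the real service)."""

    def __init__(self):
        # item id -> {"data": dict or None, "version": int,
        #             "deleted": bool, "tombstone": bool}
        self._items = {}

    def put(self, item_id, data, expected_version=None):
        current = self._items.get(item_id)
        current_version = current["version"] if current else 0
        # Optimistic concurrency: reject the write if another writer
        # changed the item since the caller last read it.
        if expected_version is not None and expected_version != current_version:
            raise ConflictError(
                f"expected version {expected_version}, found {current_version}"
            )
        self._items[item_id] = {
            "data": copy.deepcopy(data),
            "version": current_version + 1,
            "deleted": False,
            "tombstone": False,
        }
        return current_version + 1

    def get(self, item_id):
        record = self._items.get(item_id)
        if record is None or record["deleted"]:
            return None
        return copy.deepcopy(record["data"]), record["version"]

    def soft_delete(self, item_id):
        # Soft delete: hide the item but keep its data for restoration.
        self._items[item_id]["deleted"] = True

    def restore(self, item_id):
        record = self._items[item_id]
        if record["tombstone"]:
            raise ValueError("tombstoned items cannot be restored")
        record["deleted"] = False

    def tombstone(self, item_id):
        # Tombstone delete: erase the payload and keep only a marker.
        record = self._items[item_id]
        record["data"] = None
        record["deleted"] = True
        record["tombstone"] = True
```

In this sketch a writer passes the version it last read as `expected_version`; a stale value raises `ConflictError` instead of silently overwriting a newer change. A soft-deleted item disappears from `get` until `restore` is called, while a tombstoned item keeps no payload and cannot be restored.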

FAQ

Q: Should Data APIs be used to implement applications directly?

Probably not, but it's up to you. In general, we expect Data APIs to handle the core requirements of managing the data you would normally store in a relational or NoSQL database, with your application logic and presentation layer built on top. However, if your application simply needs to create, read, update, and delete data without a large amount of logic, Data APIs may be suitable for direct use from a client-side dynamic web app.

Q: Why wouldn't I use Data APIs, and instead stick with my existing RDBMS architecture?

Data APIs allow you to manage both structured and semi-structured data models, with or without schema validation. Schema validation for Resources lets you implement the kinds of data model typically built on an RDBMS, but rarely for more than 3 tables. If your data model requires tens of interlinked, 3NF tables, then Data APIs will not give you the referential integrity checking that you get from an RDBMS. An RDBMS also lets you create Views and Triggers, modify behaviour with Stored Procedures, and so on. A comparison table is provided:

| Relational Feature | Supported by Data APIs? |
| --- | --- |
| Rows | Yes. Data APIs can also support lists and maps in a row, which lets you manage a document-oriented structure. By adding a Schema that only supports scalar types, you have a direct analogue to an RDBMS row |
| Primary Keys | Yes. Each Data API Namespace must have a Primary Key |
| Foreign Keys | Sort of. By requiring that a "Parent" Item include a "Child" attribute with a given structure, you can create child to parent integrity checking, and making the "Child" attribute mandatory creates parent to child checks. You can also use References structures to link within and across Namespaces |
| Views | Not directly. Can be implemented with Glue Catalog Views against the linked tables |
| Joins | No, but you can use a Document structure to manage 1:M data structures that allow cross type queries |
| Column Filter Queries | Yes, you can use the /find API to query any Namespace on any Attribute. If any Attribute is indexed, then the first index is used |
| Arbitrary SQL | Not directly. Available through Athena, in that each Data API Namespace/Stage is registered with the AWS Glue Catalog. Also, if you configure an Elasticsearch integration, you can use the full breadth of Elasticsearch syntax to query all Data API Namespaces |
| Cost/Rule based optimiser | No. Athena will implement cost based optimisation when querying Data API data |
| Triggers | Can be implemented with AWS Lambda |
| Table Indexes | Yes, Data APIs support up to 5 indexes on each of the Resource or Metadata tables |
| Data Dictionary | Yes, supported through the /namespaces API, and for each Namespace, the ability to view the schema with `/namespace/schema/<schema type>` |
| Grants | Implemented through IAM security policies for DynamoDB, together with OAuth2 support for API Gateway |
| ACID | **A**tomicity: Yes. Each Data API modification is processed independently, and either completes fully or not at all. **C**onsistency: No. Data APIs don't support Transactions (yet) and all API requests are independently applied. However, you can implement pseudo-consistency by using non-relational nested object structures in Data APIs, which are consistently applied to the top level Resource or Metadata. **I**solation: Not relevant, because Data APIs don't implement Transactions. **D**urability: Yes, all changes are persisted to at least 2 AWS Availability Zones before the API request completes |
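
The "Column Filter Queries" entry above says a query can filter on any Attribute and that an index is used when one is available. The sketch below shows one possible reading of that rule in plain Python: the first indexed attribute that appears in the filter narrows the candidates, and any remaining filters are applied afterwards. The function `find`, its parameters, and the `access_path` return value are hypothetical; this does not call or reproduce the real /find API.

```python
def find(items, filters, indexed_attributes):
    """Return (matches, access_path) for attribute=value filters.

    `indexed_attributes` is an ordered list of attribute names that are
    assumed to be indexed (an assumption made for this illustration).
    """
    # Pick the first indexed attribute that the query filters on, if any.
    index_attr = next((a for a in indexed_attributes if a in filters), None)

    if index_attr is not None:
        # Simulate an index lookup by grouping items on the indexed value.
        index = {}
        for item in items:
            index.setdefault(item.get(index_attr), []).append(item)
        candidates = index.get(filters[index_attr], [])
        access_path = f"index:{index_attr}"
    else:
        # No usable index: fall back to scanning every item.
        candidates = items
        access_path = "scan"

    # Apply all filters to the narrowed candidate set.
    matches = [
        item for item in candidates
        if all(item.get(key) == value for key, value in filters.items())
    ]
    return matches, access_path


items = [
    {"id": 1, "colour": "red", "size": "L"},
    {"id": 2, "colour": "red", "size": "M"},
    {"id": 3, "colour": "blue", "size": "L"},
]
matches, access_path = find(items, {"colour": "red", "size": "L"}, ["size", "colour"])
```

Here both filter attributes are indexed, so the sketch uses `size` because it comes first in the index list, then checks `colour` against the narrowed set. With no indexed attribute in the filter, the same call falls back to a full scan.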

Q: Should I do data modelling like I used to?

Yes. It's important to model the data in your applications. However, Data APIs offer some important features that differ from an RDBMS, and you should keep them in mind. Firstly, data can be split between Resources and Metadata, which means that you can offer a structured, 3NF-type Resource by applying a Schema, while letting developers store whatever they want in Metadata without a Schema. You may choose to apply a Schema to both Resources & Metadata, or not use Metadata at all. Alternatively, you may wish to apply no Schema to either of these storage locations, and let your developers use the full flexibility of a Document-oriented database. It's entirely up to you!
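
To make the Resource and Metadata split concrete, here is a minimal sketch of a Schema applied to a Resource while Metadata stays free-form. The schema layout, the `validate_resource` function, the `allow_extra` flag (standing in for whether developers may extend the schema), and the example attribute names are all assumptions for illustration; the real Data API schema format may differ.

```python
def validate_resource(resource, schema, allow_extra=False):
    """Return a list of problems; an empty list means the Resource is valid.

    `schema` maps attribute name -> {"type": <python type>, "required": bool}.
    This layout is hypothetical and chosen only for the sketch.
    """
    errors = []
    for name, rule in schema.items():
        if name not in resource:
            if rule.get("required", False):
                errors.append(f"missing required attribute: {name}")
            continue
        if not isinstance(resource[name], rule["type"]):
            errors.append(f"{name} should be {rule['type'].__name__}")
    if not allow_extra:
        # Reject attributes that the schema does not declare.
        for name in resource:
            if name not in schema:
                errors.append(f"unexpected attribute: {name}")
    return errors


# A flat, scalar-only schema: the analogue of an RDBMS row.
CUSTOMER_SCHEMA = {
    "customer_id": {"type": str, "required": True},
    "name": {"type": str, "required": True},
    "credit_limit": {"type": int, "required": False},
}


def store_item(resource, metadata):
    """Validate the Resource against its schema; accept Metadata as-is."""
    errors = validate_resource(resource, CUSTOMER_SCHEMA)
    if errors:
        raise ValueError("; ".join(errors))
    return {"resource": resource, "metadata": metadata}


stored = store_item(
    {"customer_id": "c1", "name": "Ann"},
    {"source": "import-job", "tags": ["vip", {"region": "emea"}]},
)
```

The Resource must match the flat schema, while the Metadata dictionary can hold nested lists and maps without any checks, which mirrors the option of applying a Schema to one storage location and not the other.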