condense9 / Hark Lang

Licence: apache-2.0
Build stateful and portable serverless applications without thinking about infrastructure.

The Hark Programming Language

Hark lets you build serverless data pipelines in minutes, without managing any infrastructure.

Join the Slack workspace! We're talking about making cloud development easier.

Hark is for you if:

  • You use AWS.
  • You use Python for data engineering or business process pipelines.
  • You don't want to manage a task platform (Airflow, Celery, etc).

Key features:

  • First-class local testing (there's a local Hark runtime).
  • Concurrency primitives for multi-threaded pipelines.
  • Zero infrastructure management and minimal maintenance.

Quick start: Build an AWS Lambda pipeline in 2 minutes.

Comparisons:

  • Like Apache Airflow, but without infrastructure to manage.
  • Like AWS Step Functions but cloud-portable and locally testable.
  • Like Serverless Framework, but handles runtime glue logic in addition to deployment.

Status: Hark works well for small workflows: 5-10 Lambda invocations. Larger workflows may cause problems, and there is a known issue caused by DynamoDB restrictions (#12).

Documentation.

Hark was presented at PyCon Africa 2020. Watch the presentation, or check out the demos.

Contributing

Hark is growing rapidly, and contributions are welcome.

Jump on Slack to talk to us.

Is Hark for me?

Hark is for you if:

  • Your data is in AWS.
  • You use Python for processing data, or writing business process workflows.
  • You don't want to deploy and manage a task platform (Airflow, Celery, etc).

Data in: You can invoke Hark like any Lambda function (AWS cli, S3 trigger, API gateway, etc).

Data out: Use the Python libraries you already have for database access. Hark just connects them together.

Development: Hark runs locally, so you can thoroughly test Hark programs before deployment (using MinIO and LocalStack for any additional infrastructure that your code uses).

Operating: Hark enables contextual cross-thread logging and stacktraces out of the box, since the entire application is described in one place.

| Hark is like... | But... |
|---|---|
| AWS Step Functions | Hark programs aren't bound to AWS and don't use Step Functions under the hood (just plain Lambda + DynamoDB). |
| Orchestrators (Apache Airflow, etc) | You don't have to manage infrastructure, or think in terms of DAGs, and you can test everything locally. |
| Task runners (Celery, etc) | You don't have to manage infrastructure. |
| Azure Durable Functions | While powerful, Durable Functions (subjectively) feel complex - their behaviour isn't always obvious. |

Read more...

The 2 minute pipeline

All you need:

  • An AWS account, and AWS CLI configured.
  • A Python 3.8 virtual environment.

Hark is built with Python, and distributed as a Python package. To install it, run in a new virtualenv:

pip install hark-lang

This gives you the hark executable. Try hark -h.

Initialise the project with a few template files:

hark init

Copy the following snippet into service.hk:

// Import the processing functions defined in Python
import(process_video_step1, src.video, 2);
import(process_video_step2, src.video, 3);
import(process_video_step3, src.video, 3);
import(process_video_final_step, src.video, 3);

// Process a named file
fn main(bucket, key) {
  a = async process_video_step1(bucket, key);
  b = async process_video_step2(bucket, key, await a);
  c = async process_video_step3(bucket, key, await b);
  process_video_final_step(bucket, key, await c);
}

Run it locally to test:

hark service.hk -f main your_bucket filename.csv

And deploy the service to your AWS account (requires AWS credentials and AWS_DEFAULT_REGION to be defined):

hark deploy

Read more about what this actually creates.

Finally, invoke it in AWS (-f main is optional, as before):

hark invoke -f main your_bucket filename.csv
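The quick start imports four functions from src.video without showing them. Here is a minimal sketch of what src/video.py might contain. It assumes the trailing number in each import(...) line is the argument count, which matches how the functions are called in service.hk. The function bodies are placeholders and hypothetical; real steps would read from and write to the bucket.

```python
# src/video.py (hypothetical contents; the real processing steps are not
# shown in this README). Argument counts follow the import(...) lines.

def process_video_step1(bucket, key):
    """First step: called with 2 arguments."""
    return {"bucket": bucket, "key": key, "steps": ["step1"]}


def process_video_step2(bucket, key, previous):
    """Called with 3 arguments; `previous` is the awaited result of step 1."""
    return {**previous, "steps": previous["steps"] + ["step2"]}


def process_video_step3(bucket, key, previous):
    """Called with 3 arguments; `previous` is the awaited result of step 2."""
    return {**previous, "steps": previous["steps"] + ["step3"]}


def process_video_final_step(bucket, key, previous):
    """Last step: returns the list of steps that ran, in order."""
    return previous["steps"] + ["final"]
```

Because these are ordinary Python functions, they can be unit tested on their own before being wired together in Hark.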

Read more...

Language Features

Concurrency & Synchronisation

This is useful when a set of computations are related, and must be kept together.

/**
 * Return f(x) + g(x), computing f(x) and g(x) in parallel in two separate
 * threads (Lambda invocations in AWS).
 */
fn compute(x) {
  a = async f(x);     // Start computing f(x) in a new thread
  b = async g(x);     // Likewise with g(x)
  await a + await b;  // Stop this thread, and resume when {a, b} are ready
}

Traditional approach: Manually store intermediate results in an external database, and build the synchronisation logic into the cloud functions f and g, or use an orchestrator service.
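For readers more familiar with Python, the shape of compute can be approximated locally with the standard library's thread pool. This is only an analogy: the bodies of f and g are placeholders, and local threads block while waiting, whereas a Hark thread stops completely and each async call runs as a separate Lambda invocation in AWS.

```python
from concurrent.futures import ThreadPoolExecutor


def f(x):
    return x * 2  # placeholder computation


def g(x):
    return x + 10  # placeholder computation


def compute(x):
    """Return f(x) + g(x), running f and g concurrently in two threads."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        a = pool.submit(f, x)  # roughly `a = async f(x)`
        b = pool.submit(g, x)  # roughly `b = async g(x)`
        return a.result() + b.result()  # roughly `await a + await b`
```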

Read more...

Trivial Pipelines

Use this approach when each individual function may take several minutes (and hence, together would break the 15 minute AWS Lambda limit).

/**
 * Compute f(g(h(x))), using a separate lambda invocation for each
 * function call.
 */
fn pipeline(x) {
  a = async h(x);
  b = async g(await a);
  f(await b);
}

Traditional approach: This is functionally similar to a "chain" of AWS Lambda functions and SQS queues.
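To make the comparison concrete, the hand-wired chain can be imitated locally. In this sketch, in-process queues stand in for SQS queues and threads stand in for Lambda functions. The stage helper and the bodies of h, g and f are hypothetical, and nothing here talks to AWS.

```python
import queue
import threading


def h(x):
    return x + 1  # placeholder


def g(x):
    return x * 2  # placeholder


def f(x):
    return x - 3  # placeholder


def stage(func, inbox, outbox):
    """One link in the chain: take a message, apply func, forward the result."""
    outbox.put(func(inbox.get()))


def pipeline(x):
    """Compute f(g(h(x))) by passing messages through three queued stages."""
    q0, q1, q2, q3 = (queue.Queue() for _ in range(4))
    workers = [
        threading.Thread(target=stage, args=(h, q0, q1)),
        threading.Thread(target=stage, args=(g, q1, q2)),
        threading.Thread(target=stage, args=(f, q2, q3)),
    ]
    for worker in workers:
        worker.start()
    q0.put(x)          # feed the first stage
    result = q3.get()  # block until the last stage has produced a value
    for worker in workers:
        worker.join()
    return result
```

Every stage needs its own queue and wiring here, which is the glue the three-line Hark version leaves to the runtime.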

Mapping / reducing

Hark functions are first-class, and can be passed around (closures and anonymous functions are planned, giving Hark object-oriented capabilities).

/**
 * Compute [f(element) for element in x], using a separate lambda invocation for
 * each application of f.
 */
fn map(f, x, accumulator) {
  if nullp(x) {
    accumulator
  }
  else {
    // The Hark compiler has tail-recursion optimisation
    map(f, rest(x), append(accumulator, async f(first(x))))
  }
}

This could be used like:

fn add2(x) {
  x + 2
}

fn main() {
  futures = map(add2, [1, 2, 3, 4], []);
  // ...
}
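A rough local Python equivalent, for comparison. It uses a thread pool in place of Lambda invocations, and the map_async name is hypothetical. The Hark example elides what happens to the futures; here they are collected by calling result() on each.

```python
from concurrent.futures import ThreadPoolExecutor


def add2(x):
    return x + 2


def map_async(pool, f, xs):
    """Start f(x) for every x and return the futures without waiting."""
    return [pool.submit(f, x) for x in xs]


def main():
    with ThreadPoolExecutor() as pool:
        futures = map_async(pool, add2, [1, 2, 3, 4])
        # Wait for each application of f and gather the values in order.
        return [future.result() for future in futures]
```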

Read more...

Notes about syntax

The syntax should look familiar, but there are a couple of things to point out.

No 'return' statement

Every expression must return a value, so there is no return statement. The last expression in a 'block' (expressions between { and }) is returned implicitly.

fn foo() {
  "something"
}

fn main() {
  print(foo())  // -> prints "something"
}

Semi-colons are required...

... when there is more than one expression in a block.

This is ok:

fn main() {
  print("done")
}

So is this:

fn main() {
  print("one");
  print("two")
}

And this:

fn main() {
  print("one");
  print("two");
}

But this is not ok:

fn main() {
  print("one")  // <- missing semicolon!
  print("two")
}

'print' returns the value printed

In this snippet, "Hello Worlds!" is actually printed twice. First in bar, then in main.

fn bar() {
  print("Hello Worlds!")
}

fn main() {
  print(bar())
}

$> hark -q service.hk
Hello Worlds!
Hello Worlds!

'if' is an expression, and returns a value

Think about it like this: An if expression represents a choice between values.

v = if something { true_value } else { false_value };

// if 'something' is not true, v is set to null
v = if something { value };

FAQ

Why is this not a library/DSL in Python?

When Hark threads wait on a Future, they stop completely. The Lambda function saves the machine state and then terminates. When the Future resolves, the resolving thread restarts any waiting threads by invoking new Lambdas to pick up execution.

To achieve the same thing in Python, the framework would need to dump the entire Python VM state to disk, and then reload it at a later point -- this may be possible, but would certainly be non-trivial. An alternative approach would be to build a language on top of Python that looked similar to Python, but felt wrong because it was really faking things under the hood.
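The stop-and-resume behaviour described above can be modelled with a toy sketch. It is not Hark's implementation: all names are hypothetical, an in-memory list stands in for persistent storage, and a plain function call stands in for invoking a new Lambda. The saved "machine state" is an explicit continuation name plus a dict, which is exactly the simplification ordinary Python code does not allow, since a function cannot be stopped mid-body and saved this way.

```python
class Future:
    """A result that may not exist yet, plus the threads stopped waiting on it."""

    def __init__(self):
        self.resolved = False
        self.value = None
        self.waiters = []  # saved (continuation name, state) pairs


def invoke(functions, name, state, value):
    """Stand-in for starting a fresh Lambda that picks up execution."""
    return functions[name](state, value)


def wait(functions, future, name, state):
    """Stop the current 'thread': save its state unless the value is ready."""
    if future.resolved:
        invoke(functions, name, state, future.value)
    else:
        future.waiters.append((name, state))


def resolve(functions, future, value):
    """Store the value, then restart every thread that was waiting on it."""
    future.resolved = True
    future.value = value
    waiters, future.waiters = future.waiters, []
    for name, state in waiters:
        invoke(functions, name, state, value)


def demo():
    results = []
    functions = {"after_a": lambda state, a: results.append(state["x"] + a)}
    fut = Future()
    wait(functions, fut, "after_a", {"x": 1})  # nothing runs yet; state is saved
    assert results == []
    resolve(functions, fut, 41)  # the resolver restarts the waiter
    return results
```

Nothing runs between wait and resolve: the waiting thread exists only as saved data until the resolving thread restarts it.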

How is Hark like Go?

Goroutines are very lightweight, while Hark async functions are pretty heavy -- they involve creating a new Lambda (or process, when running locally).

Hark's concurrency model is similar to Go's, but channels are not fully implemented so data can only be sent to/from a thread at call/return points.

Is this an infrastructure-as-code tool?

No, Hark does not do general-purpose infrastructure management. There are already great tools to do that (Terraform, Pulumi, Serverless Framework, etc).

Instead, Hark reduces the amount of infrastructure you need. Instead of a distinct Lambda function for every piece of application logic, you only need the core Hark interpreter (purely serverless) infrastructure.

Hark will happily manage that infrastructure for you (through hark deploy and hark destroy), or you can set it up with your in-house custom system.

Current Limitations and Roadmap

Hark is beta quality, which means that it's not thoroughly tested or feature complete. This is a non-exhaustive list.

Libraries

Only one Hark program file is supported, but a module/package system is planned.

Error Handling

There's no error handling - if your function fails, you'll have to restart the whole process manually. An exception handling system is planned.

Typing

Function inputs and outputs aren't typed. This is a limitation, and will be fixed soon, probably using ProtoBufs as the interface definition language.

Calling Arbitrary Services

Currently you can only call Hark or Python functions -- arbitrary microservices can't be called. Before Hark v1.0 is released, this will be possible. You will be able to call a long-running third party service (e.g. an AWS ML service) as a normal Hark function and await on the result.


About

Hark is maintained by Condense9 Ltd. Get in touch with [email protected] for help getting running, or if you need enterprise deployment.

Hark started because we couldn't find any data engineering tools that were productive and felt like software engineering. As an industry, we've spent decades growing a wealth of computer science knowledge, but building data pipelines in $IaC, or manually crafting workflow DAGs with $AutomationTool, just isn't software.

Join us on Slack.

Teal

Hark used to be called Teal.

Change your remotes if you checked out the previous repository:

git remote set-url origin [email protected]:condense9/hark-lang.git

License

Apache License (Version 2.0). See LICENSE for details.


The end. Here's a spaceship. Hacks and glory await.



                     `. ___
                    __,' __`.                _..----....____
        __...--.'``;.   ,.   ;``--..__     .'    ,-._    _.-'
  _..-''-------'   `'   `'   `'     O ``-''._   (,;') _,'
,'________________                          \`-._`-','
 `._              ```````````------...___   '-.._'-:
    ```--.._      ,.                     ````--...__\-.
            `.--. `-`                       ____    |  |`
              `. `.                       ,'`````.  ;  ;`
                `._`.        __________   `.      \'__/`
                   `-:._____/______/___/____`.     \  `
                               |       `._    `.    \
                               `._________`-.   `.   `.___
                                             SSt  `------'`