bigscience-workshop / bigscience

Licence: other
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Programming Languages

  • shell (77523 projects)
  • python (139335 projects - #7 most used programming language)
  • Makefile (30231 projects)

Projects that are alternatives to or similar to bigscience

Dynamic Training Bench
Simplify the training and tuning of Tensorflow models
Stars: ✭ 210 (-19.54%)
Mutual labels:  training, models
thelper
Training framework & tools for PyTorch-based machine learning projects.
Stars: ✭ 14 (-94.64%)
Mutual labels:  training, models
ModelZoo.pytorch
Hands on Imagenet training. Unofficial ModelZoo project on Pytorch. MobileNetV3 Top1 75.64🌟 GhostNet1.3x 75.78🌟
Stars: ✭ 42 (-83.91%)
Mutual labels:  training
spacy-universal-sentence-encoder
Google USE (Universal Sentence Encoder) for spaCy
Stars: ✭ 102 (-60.92%)
Mutual labels:  models
Multi-Person-Pose-using-Body-Parts
No description or website provided.
Stars: ✭ 41 (-84.29%)
Mutual labels:  training
optimum
🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools
Stars: ✭ 567 (+117.24%)
Mutual labels:  training
carto-workshop
CARTO training materials
Stars: ✭ 81 (-68.97%)
Mutual labels:  training
Deep-Learning-Models
Deep Learning Models implemented in python.
Stars: ✭ 17 (-93.49%)
Mutual labels:  models
Teaching-Data-Visualisation
Presentation and exercises for the Software Sustainability Institute Research Data Visualisation Workshop (RDVW)
Stars: ✭ 15 (-94.25%)
Mutual labels:  training
tensorpeers
p2p peer-to-peer training of tensorflow models
Stars: ✭ 57 (-78.16%)
Mutual labels:  training
CPPE-Dataset
Code for our paper CPPE - 5 (Medical Personal Protective Equipment), a new challenging object detection dataset
Stars: ✭ 42 (-83.91%)
Mutual labels:  models
traindown-dart
Dart (and Flutter) library for the Traindown Markup Language. This is the reference implementation for now. It is first to receive features and fixes.
Stars: ✭ 16 (-93.87%)
Mutual labels:  training
Wipro-PJP
Code written during Wipro PJP. 🍵📑
Stars: ✭ 60 (-77.01%)
Mutual labels:  training
MaxibonKataKotlin
Maxibon kata for Kotlin Developers. The main goal is to practice property based testing.
Stars: ✭ 42 (-83.91%)
Mutual labels:  training
curriculum-foundation
iSAQB Curriculum for the CPSA - Foundation Level. This repository contains copyrighted work.
Stars: ✭ 35 (-86.59%)
Mutual labels:  training
model-zoo-old
The ONNX Model Zoo is a collection of pre-trained models for state of the art models in deep learning, available in the ONNX format
Stars: ✭ 38 (-85.44%)
Mutual labels:  models
KataSuperHeroesIOS
Super heroes kata for iOS Developers. The main goal is to practice UI Testing.
Stars: ✭ 69 (-73.56%)
Mutual labels:  training
diwa
A Deliberately Insecure Web Application
Stars: ✭ 32 (-87.74%)
Mutual labels:  training
pydbantic
A single model for shaping, creating, accessing, storing data within a Database
Stars: ✭ 137 (-47.51%)
Mutual labels:  models
condvis
Visualisation for statistical models.
Stars: ✭ 20 (-92.34%)
Mutual labels:  models

bigscience

Research workshop on large language models - The Summer of Language Models 21

At the moment we have 2 code repos:

  1. https://github.com/bigscience-workshop/Megatron-DeepSpeed - this is our flagship code base
  2. https://github.com/bigscience-workshop/bigscience - (this repo) for everything else - docs, experiments, etc.

Currently, the most active segments of this repo are:

  • JZ - lots of information about our work environment, which helps us evaluate, plan and get things done
  • Experiments - the many experiments we are running; documentation, result tables, scripts and logs are all there
  • Datasets info
  • Train - all the information about the current trainings (see below for the most important ones)

We have READMEs for specific aspects, such as:

Trainings

While we keep detailed chronicles of experiments and findings for some of the main trainings, here is a doc that contains a summary of the most important findings: Lessons learned

Train 1 - 13B - unmodified Megatron gpt2 - baseline

You can watch the training logs live by running this tail -f-like script over the remote log file, which gets synced to the hub once an hour:

perl -e '$u=shift; $b=0; while(1){($e)=qx[curl -sI $u]=~/content-length: (\d+)/; \
print qx[curl -sr $b-$e -L $u] if $e>$b; $b=$e; sleep 300}' \
https://huggingface.co/bigscience/tr1-13B-logs/resolve/main/main_log.txt
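
For reference, here is a rough Python equivalent of the perl one-liner above (a sketch only, assuming the third-party requests package is installed; the tail_remote name is just for illustration). It polls the remote file's size with a HEAD request and fetches only the newly appended byte range every five minutes:

# Sketch of a remote "tail -f" in Python (assumes `pip install requests`).
import sys
import time

import requests

def tail_remote(url: str, interval: int = 300) -> None:
    seen = 0  # bytes already printed
    while True:
        # HEAD request (following redirects) to learn the current file size.
        head = requests.head(url, allow_redirects=True)
        size = int(head.headers.get("content-length", 0))
        if size > seen:
            # Range request for only the bytes appended since the last poll.
            resp = requests.get(
                url,
                headers={"Range": f"bytes={seen}-{size}"},
                allow_redirects=True,
            )
            sys.stdout.write(resp.text)
            sys.stdout.flush()
            seen = size
        time.sleep(interval)

if __name__ == "__main__":
    tail_remote(sys.argv[1])

Saved as, say, tail_remote.py, it would be run as: python tail_remote.py https://huggingface.co/bigscience/tr1-13B-logs/resolve/main/main_log.txt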

Train 3

Architecture and scaling baseline runs: no fancy tricks, just GPT2. Here are links to the respective tensorboards:

Size                   1B3                    760M   350M   125M
C4 + low warmup        a                      b      c
OSCAR + low warmup     f
C4 + high warmup       e
OSCAR + high warmup    d (current baseline)   g      h      i
Pile + high warmup     m                      j      k      l

Train 8

104B - unmodified Megatron gpt2 - with extra-wide hidden size to learn how to deal with training instabilities

You can watch the training logs live by running this tail -f-like script over the remote log file, which gets synced to the hub once an hour:

perl -e '$u=shift; $b=0; while(1){($e)=qx[curl -sI $u]=~/content-length: (\d+)/; \
print qx[curl -sr $b-$e -L $u] if $e>$b; $b=$e; sleep 300}' \
https://cdn-lfs.huggingface.co/bigscience/tr8-104B-logs/b2cc478d5ae7c9ec937ea2db1d2fe09de593fa2ec38c171d6cc5dca094cd79f9

Train 11

This is the current main training.

tr11-176B-ml

You can watch the training logs live by running this tail -f-like script over the remote log file, which gets synced to the hub once an hour:

perl -e '$u=shift; $b=0; while(1){($e)=qx[curl -LsI $u]=~/2 200.*?content-length: (\d+)/s; \
print qx[curl -Lsr $b-$e $u] if $e>$b; $b=$e; sleep 300}' \
https://huggingface.co/bigscience/tr11-176B-ml-logs/resolve/main/logs/main/main_log.txt